sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.45k stars 481 forks source link

Add an autoconf configure script for sagelib #29119

Closed embray closed 3 years ago

embray commented 4 years ago

The proposal is to have a configure script specific to sagelib itself. This can reuse most of the work already done and to be done for #27330 in order to detect sagelib's non-Python compilation dependencies at build time (spkg-configure.m4). src/configure.ac would be simpler than the configure.ac for sage-the-distribution, as it would not need to do anything related to SAGE_LOCAL, and (possibly) only needs to detect sagelib's direct dependencies.

This configure script can easily be included in source tarballs for sagelib, and can be called automatically by the setup.py.

For prior art on this, see cysignals which also runs a configure script to detect dependencies for its C sources. While I was at one time skeptical of cysignals' approach to this (it is unusual to make configure part of the build system for a Python package), over time I've found that it works very well in practice.

It uses standard tools that would already be available on systems capable of building sagelib in the first place, and does not introduce any new or out-of-the-ordinary concepts for packagers (aside from having a configure alongside setup.py).

It would explicitly check for sagelib's dependencies in a manner that can be used completely independently of sage-the-distribution.

Related:

Depends on #29039

CC: @dimpase @mkoeppe @jdemeyer @orlitzky

Component: build: configure

Reviewer: Dima Pasechnik

Issue created by migration from https://trac.sagemath.org/ticket/29119

embray commented 4 years ago

Dependencies: #29118

embray commented 4 years ago

Description changed:

--- 
+++ 
@@ -10,6 +10,9 @@

 It also has the advantages that it would explicitly check for sagelib's dependencies in a manner that can be used completely independently of sage-the-distribution.

+* **Q:** What files would be generated by this `configure` script?
+  * **A:** This proposal remains agnostic to that question.  It could be anything from header files, plain-text config files, or Python modules under the `sage` package as #29022 does (the difference being this works outside of sage-the-distribution).
+
 * **Q:** How does this impact pip install of sagelib?
   * **A:** In the default case it would "just work": If all of sagelib's dependencies are met by the system then they will be detected on the default paths.  If any dependencies are missing that will be reported (hopefully clearly) and the build will fail.  When pip-installing setuptools-based packages, it is possible to pass additional command-line flags to the underlying `setup.py` call (see e.g. the `--global-option` and `--install-option` flags to `pip install`).  We could easily allow passing passing through additional flags to the underlying `configure` script if/when special customization is needed.
embray commented 4 years ago

Description changed:

embray commented 4 years ago
comment:4

Another thing I should expand on: The proposed configure script would re-use most of the machinery developed for #27330, which would require maybe just some slight adjustments for configuring sage-the-distribution versus configuring a standalone sagelib.

dimpase commented 4 years ago
comment:5

Many distros don't use ./configure to build Sage, so for them it would be quite a change.

mkoeppe commented 4 years ago
comment:6

This is a nice idea that would have a place within the framework of

I see it as follows:

  1. An sdist of sagelib for PyPI would need to configure itself in some way. An autoconf-generated configure script could be a solution (though I seem to remember that in previous discussions this idea has met resistance -- not from me, I'm obviously in favor of autotools). As you say, this should be able to reuse the build/pkgs/SPKG/spkg-configure.m4 infrastructure.

    However, I don't think it is actionable at the moment because sagelib is not modular enough and depends on too many packages. Existing efforts such as using Features for sagelib configuration / modularization seem to have stalled. (by the way - #25828 needs help).

  2. The build of sagelib within sage-the-distribution should definitely not run another round of "configure" because everything has already been determined.

  3. My guess is that downstream packagers would prefer configuring sagelib statically - because some distribution-specific things need to be set manually anyway.

So the way I see it, there are multiple ways how sagelib should get its build-time (and run-time) configuration. This is exactly what #29038 (sage_conf) addressed. It defined one interface of reading a configuration (from shell and from Python) and provided the reference implementation for case 2 (build of sagelib within sage-the-distribution). This encourages multiple implementations for the multiple circumstances in which sagelib is built.

You could provide a new implementation of sage_conf for case 1 (PyPI), which would be distributed with an sdist of sagelib, using the autoconf approach.

Downstream packagers are invited to use their own implementation of sage_conf for case 3. This has not happened yet -- after all, so far there is no release that has sage_conf.

mkoeppe commented 4 years ago

Changed dependencies from #29118 to none

orlitzky commented 4 years ago
comment:8

+1, I was just going to suggest this on #29118.

The script for sagelib should do in-depth checks for the things that sagelib uses, and would be able to e.g. detect the path to nauty executables and substitute them into graph_generators.py.in, creating graph_generators.py in the process. Packagers would have no problem with this, and would save us a lot of trouble trying to work around the seven different ways that we've wound up packaging sage due to how awkward it was to do historically.

The top-level distribution script would then check for things the distribution needs (like sqlite3 on the command line), and only needs to do very basic feature tests.

mkoeppe commented 4 years ago

Description changed:

--- 
+++ 
@@ -1,17 +1,15 @@
-This is an alternative proposal to #29038.  It's also something that was in the back of my mind when starting work on #27330, but that I never fully articulated.
-
-The proposal is to have a `configure` script specific to sagelib itself.  This can reuse most of the work already done and to be done for #27330 in order to detect sagelib's non-Python compilation dependencies at build time.    It would be simpler than the `configure.ac` for sage-the-distribution, as it would not need to do anything related to SAGE_LOCAL, and (possibly) only needs to detect sagelib's direct dependencies.
+The proposal is to have a `configure` script specific to sagelib itself.  This can reuse most of the work already done and to be done for #27330 in order to detect sagelib's non-Python compilation dependencies at build time (`spkg-configure.m4`).  `src/configure.ac` would be simpler than the `configure.ac` for sage-the-distribution, as it would not need to do anything related to `SAGE_LOCAL`, and (possibly) only needs to detect sagelib's direct dependencies.

 This `configure` script can easily be included in source tarballs for sagelib, and can be called automatically by the `setup.py`.

 For prior art on this, see [cysignals](https://github.com/sagemath/cysignals) which also runs a `configure` script to detect dependencies for its C sources.  While I was at one time skeptical of cysignals' approach to this (it is unusual to make `configure` part of the build system for a Python package), over time I've found that it works very well in practice.

-I believe that this would solve the same problems #29038 is trying to solve, and better: It uses standard tools that would already be available on systems capable of building sagelib in the first place, and does not introduce any new or out-of-the-ordinary concepts for packagers (aside from having a `configure` alongside `setup.py`).
+It uses standard tools that would already be available on systems capable of building sagelib in the first place, and does not introduce any new or out-of-the-ordinary concepts for packagers (aside from having a `configure` alongside `setup.py`).

-It also has the advantages that it would explicitly check for sagelib's dependencies in a manner that can be used completely independently of sage-the-distribution.
+It would explicitly check for sagelib's dependencies in a manner that can be used completely independently of sage-the-distribution.

 * **Q:** What files would be generated by this `configure` script?
-  * **A:** This proposal remains agnostic to that question.  It could be anything from header files, plain-text config files, or Python modules under the `sage` package as #29022 does (the difference being this works outside of sage-the-distribution).
+  * **A:** This proposal remains agnostic to that question.  It could be anything from header files, plain-text config files (as proposed in #22652 in 2017), shell scripts (such as `src/bin/sage-env-config` implemented in #21479 in 2016/17, currently written by `SAGE_ROOT/configure`) or Python modules under the `sage` package (such as `sage.env_config`, like #29022 had proposed), or the Python module `sage_conf` next to the `sage` module (as #29038 implemented).

 * **Q:** How does this impact pip install of sagelib?
   * **A:** In the default case it would "just work": If all of sagelib's dependencies are met by the system then they will be detected on the default paths.  If any dependencies are missing that will be reported (hopefully clearly) and the build will fail.  When pip-installing setuptools-based packages, it is possible to pass additional command-line flags to the underlying `setup.py` call (see e.g. the `--global-option` and `--install-option` flags to `pip install`).  We could easily allow passing passing through additional flags to the underlying `configure` script if/when special customization is needed.
@@ -19,3 +17,9 @@
 * **Q:** How does this impact installing sagelib in a virtualenv/venv?
   * **A:** Basically not at all/trivially: If the dependencies are already met by the system then this proceeds just as installing directly into the system site-packages.  If some of the dependencies are installed in the virtualenv itself, and we want to use them, it is necessary to pass additional `C(PP)FLAGS` and `LDFLAGS` so that the virtualenv is used as a header/library search path.  This is true of any package that has dependencies in a non-standard prefix.
     * In the virtualenv case we could in principle handle `CFLAGS` and `LDFLAGS` automatically by checking for the `VIRTUALENV` environment variable, but this is a separate enhancement.
+
+
+Related:
+- #21707: Meta-ticket: Split sage-env into 5 to clean up sage configuration
+- #21507: Task ticket: Make sagelib a pip-installable Python source package, listed on PyPI
+
embray commented 4 years ago
comment:10

This still makes sense, because having a configure script for sagelib which is run by setup.py would render this sage_conf script unnecessary.

embray commented 4 years ago

Dependencies: #29118

embray commented 4 years ago
comment:11

Replying to @dimpase:

Many distros don't use ./configure to build Sage, so for them it would be quite a change.

That's the sage-the-distribution ./configure. Of course they don't use it, because that's for building an entire software distribution which is obviously moot when packaging sage for their distribution. This is about adding a ./configure script just for sagelib itself, under src/. It would do much of the same dependency detection, but would be entirely independent of sage-the-distribution (references to "$SAGE_LOCAL", if there even are any, would just be an alias for $prefix in this case).

embray commented 4 years ago
comment:12

Replying to @mkoeppe:

This is a nice idea that would have a place within the framework of

  • 21707: Meta-ticket: Split sage-env into 5 to clean up sage configuration

  • 21507: Task ticket: Make sagelib a pip-installable Python source package, listed on PyPI.

I see it as follows:

  1. An sdist of sagelib for PyPI would need to configure itself in some way. An autoconf-generated configure script could be a solution (though I seem to remember that in previous discussions this idea has met resistance -- not from me, I'm obviously in favor of autotools). As you say, this should be able to reuse the build/pkgs/SPKG/spkg-configure.m4 infrastructure.

I think I resisted it at first because it is a bit unusual. But that's why I brought of Cysignals. I was skeptical of it there too, but after a couple years of using it I've found it works well in practice.

embray commented 4 years ago
comment:13

Replying to @mkoeppe:

However, I don't think it is actionable at the moment because sagelib is not modular enough and depends on too many packages. Existing efforts such as using Features for sagelib configuration / modularization seem to have stalled. (by the way - #25828 needs help).

Yeah, it's still a bit of a mess, but actually not too bad. This would only be for detecting dependencies needed to build and link sagelib's extension modules.

This would require slight reworking of how OptionalExtension works (which is needed anyways for the goal of #21507). Currently OptionalExtension works by checking if some optional SPKGs required for the extension are installed. Instead it should divorce sagelib from the notion of SPKGs per-se, and just detect at configure-time what dependencies are available. I'm agnostic to how configure records that (maybe just a list of detected dependencies that have the same name as the SPKGs?), which OptionalExtension can read to see whether or not to build that extension.

matplotlib works much the same way, except I don't think it uses autoconf.

  1. The build of sagelib within sage-the-distribution should definitely not run another round of "configure" because everything has already been determined.

Maybe, maybe not. It would actually be good if, when building sagelib, it is installed just like any other package, including re-running its configure script, which would not be the same as the sage-the-distribution configure script. We can certainly make better use of caching to speed this up, but it's not that onerous. It would not need to do as much as the sage-the-distribution configure.

  1. My guess is that downstream packagers would prefer configuring sagelib statically - because some distribution-specific things need to be set manually anyway.

So the way I see it, there are multiple ways how sagelib should get its build-time (and run-time) configuration. This is exactly what #29038 (sage_conf) addressed. It defined one interface of reading a configuration (from shell and from Python) and provided the reference implementation for case 2 (build of sagelib within sage-the-distribution). This encourages multiple implementations for the multiple circumstances in which sagelib is built.

Right, and this would not need "multiple implementations" because autoconf is a (mostly) platform agnostic tool used to configure packages on their target platform.

You could provide a new implementation of sage_conf for case 1 (PyPI), which would be distributed with an sdist of sagelib, using the autoconf approach.

But would you even need sage_conf anymore? I thought you might like this because it's closer to your original approach of generating a file that goes directly into the sage package. If you want it to be a .py file it can do that. In fact, using autoconf we can generate any number of files (.py, .env, etc.) using the same information gathered by configure.

Downstream packagers are invited to use their own implementation of sage_conf for case 3. This has not happened yet -- after all, so far there is no release that has sage_conf.

This would obviate the need for them to and likely make their lives easier.

embray commented 4 years ago
comment:14

Replying to @orlitzky:

+1, I was just going to suggest this on #29118.

The script for sagelib should do in-depth checks for the things that sagelib uses, and would be able to e.g. detect the path to nauty executables and substitute them into graph_generators.py.in, creating graph_generators.py in the process. Packagers would have no problem with this, and would save us a lot of trouble trying to work around the seven different ways that we've wound up packaging sage due to how awkward it was to do historically.

Yes, this is the right idea in general. In the case of graph_generators.py though I would not do this. It would be much simpler and more manageable if all the dependencies detected by configure were output to a single file, such as @mkoeppe's sage/env_config.py.in from #29022, and individual sub-modules like graph_generators.py can just get what they need from sage.env.

I originally rejected that idea on two grounds:

  1. It was too dependent on sage-the-distribution to create sage/env_config.py from sage/env_config.py.in. Packagers (including a sagelib sdist) would need their own ad-hoc solutions for generating this file.

  2. I felt making a python module was unnecessary, and that it would be more useful as a simple list of KEY=value pairs that can be read e.g. by shell scripts.

Of these, 2. is more of a bikeshed--I think it's fine if it's a .py module, even if it's a very simple one.

Issue 1. I feel was more important, and would be addressed by this proposal, I think making for a good compromise.

And as I mentioned above, if there was a need to create a plain-text file with Sage configuration, that could also be just as easily output by a configure script alongside the .py file.

The top-level distribution script would then check for things the distribution needs (like sqlite3 on the command line), and only needs to do very basic feature tests.

For now it would still need to do more than that, because it is needed to build the entire sage-the-distribution. So it needs to decide what SPKGs to install.

The proposed configure for sagelib (which in the current repository layout would be under src/configure) would have nothing to do directly with SPKGs. It would only detect the presence of the dependencies, and not do anything about installing them if they're missing. The only relation it would have to SPKGs, is that in generating this configure script we can re-use all the checks we already implemented in the spkg-configure.m4 files.

dimpase commented 4 years ago
comment:15

OptionalExtension? Aren't we getting rid of it all together? (cf. #28815)

embray commented 4 years ago
comment:16

Replying to @dimpase:

OptionalExtension? Aren't we getting rid of it all together? (cf. #28815)

Sure, either way. I think this would somewhat supersede #25828, because instead the configure script would be needed to detect build-time dependencies. I would envision Features being used only for runtime optional dependencies.

Whatever the case you need way to enable/disable certain extension modules depending on what build dependencies are detected, but using the results of a configure script to do this would be quite easy and standard practice.

embray commented 4 years ago
comment:18

One thing I've long wanted is a kind of pure-Python autoconf replacement, for configuring non-trivial Python packages. Many packages like Numpy, Scipy, matplotlib, and others all have their own ad-hoc code for doing essentially the same things, and it would be lovely to have a standard package for that.

This is more-or-less what SCons is, yet most of these projects don't use it, and I'm curious why.

Anyways, we now already have a nearly complete set of autoconf macros for checking Sage's build dependencies, so we might as well use autoconf.

orlitzky commented 4 years ago
comment:19

Replying to @embray:

This is more-or-less what SCons is, yet most of these projects don't use it, and I'm curious why.

https://wiki.gentoo.org/wiki/SCons#Why_you_should_NOT_use_SCons_in_your_project

IMO it tries to replace autotools, but does everything worse than autotools. You might as well just use autotools. Writing M4 sucks, but after you get everything written and ship a release or two with some missing files because you forgot to add dist_ to the name, everything works pretty well.

dimpase commented 4 years ago
comment:20

Replying to @orlitzky:

Replying to @embray:

This is more-or-less what SCons is, yet most of these projects don't use it, and I'm curious why.

https://wiki.gentoo.org/wiki/SCons#Why_you_should_NOT_use_SCons_in_your_project

cmake has a much better marketing department :-)

Besides scons has a reputation for being very slow. (iirc it is a full build tool, not like autoconf and cmake, which actually produce makefiles for make or similar tools)

IMO it tries to replace autotools, but does everything worse than autotools. You might as well just use autotools. Writing M4 sucks, but after you get everything written and ship a release or two with some missing files because you forgot to add dist_ to the name, everything works pretty well.

embray commented 4 years ago
comment:21

Replying to @orlitzky:

Replying to @embray:

This is more-or-less what SCons is, yet most of these projects don't use it, and I'm curious why.

https://wiki.gentoo.org/wiki/SCons#Why_you_should_NOT_use_SCons_in_your_project

IMO it tries to replace autotools, but does everything worse than autotools. You might as well just use autotools. Writing M4 sucks, but after you get everything written and ship a release or two with some missing files because you forgot to add dist_ to the name, everything works pretty well.

That's pretty much more-or-less what I've heard as well; I never tried building a project with it myself, but have heard mostly frustration about it (even though I do think it's a good idea in principle). Alas, for all its faults autoconf is battle-tested.

And since we already have increasingly robust M4 macros for testing sage's build dependencies it seems like the right choice here.

embray commented 4 years ago
comment:22

I'm working on a prototype for this, but I have other work priorities that are taking precedence.

mkoeppe commented 3 years ago

Changed dependencies from #29118 to #29039

mkoeppe commented 3 years ago
comment:24

29039 does this via a version of sage_conf

dimpase commented 3 years ago

Reviewer: Dima Pasechnik