easybuilders / easybuild-framework

EasyBuild is a software installation framework in Python that allows you to install software in a structured and robust way.
https://easybuild.io
GNU General Public License v2.0

`--try-toolchain` still complains about missing dependencies #3996

Open eburgueno opened 2 years ago

eburgueno commented 2 years ago

I am trying to install R-4.1.0 using the latest foss toolchain (2021b). This is primarily because I already installed R-4.1.2 and I want to reuse dependencies as much as possible. Currently, EasyBuild complains that some dependencies are missing, even though presumably --try-toolchain should attempt to build them recursively if --robot is used (shouldn't it?):

$ eb -r --try-toolchain foss,2021b -D R-4.1.0-foss-2021a.eb
== Temporary log file in case of crash /tmp/eb-e0fwy6sw/easybuild-2x8c1uzi.log
== found valid index for /software/EasyBuild/4.5.4/easybuild/easyconfigs, so using it...
== found valid index for /software/EasyBuild/4.5.4/easybuild/easyconfigs, so using it...
ERROR: Missing dependencies: Python/3.9.5-GCCcore-11.2.0, Autotools/20210128-GCCcore-11.2.0, SQLite/3.35.4-foss-2021b, expat/2.2.9-GCCcore-11.2.0, Python/3.9.5-GCCcore-11.2.0-bare, CMake/3.20.1-GCCcore-11.2.0, cURL/7.76.0-foss-2021b, FFTW/3.3.9-foss-2021b, Python/3.9.5-foss-2021b, expat/2.2.9-foss-2021b, Autotools/20210128-gompi-2021b, CMake/3.20.1-gompi-2021b, cURL/7.76.0-gompi-2021b, SQLite/3.35.4-GCCcore-11.2.0, cURL/7.76.0-GCCcore-11.2.0 (no easyconfig file or existing module found)

Additional information:

$ eb --show-config
#
# Current EasyBuild configuration
# (C: command line argument, D: default value, E: environment variable, F: configuration file)
#
allow-modules-tool-mismatch (F) = True
buildpath                   (F) = /dev/shm
configfiles                 (E) = /software/powerPlant/etc/easybuild.cfg
containerpath               (D) = /home/user1/.local/easybuild/containers
detect-loaded-modules       (F) = purge
installpath                 (D) = /home/user1/.local/easybuild
installpath-modules         (F) = /software/powerPlant/modulefiles
installpath-software        (F) = /software
job-cores                   (F) = 4
module-syntax               (F) = Tcl
modules-tool                (F) = EnvironmentModules
repositorypath              (D) = /home/user1/.local/easybuild/ebfiles_repo
robot-paths                 (D) = /software/EasyBuild/4.5.4/easybuild/easyconfigs
sourcepath                  (D) = /home/user1/.local/easybuild/sources

Build log attached: easybuild-2x8c1uzi.log.gz

casparvl commented 2 years ago

Hm, I was also under the impression that EasyBuild should realize what the toolchain hierarchy is, and simply recursively run something like eb Python-3.9.5-GCCcore-10.3.0.eb --try-toolchain=GCCcore,11.2.0, but it seems to get confused about R being at foss level, and the dependencies at GCCcore level. For example, running

eb --try-toolchain=GCCcore,11.2.0 IPython-7.25.0-GCCcore-10.3.0.eb

(which also has Python 3.9.5 as a dependency) runs correctly and generates a tweaked EasyConfig for that Python:

* [ ] /scratch-shared/casparl/eb-ak2p_770/tweaked_dep_easyconfigs/Python-3.9.5-GCCcore-11.2.0.eb (module: Python/3.9.5-GCCcore-11.2.0)

without issues, whereas

eb SciPy-bundle-2021.05-foss-2021a.eb -D --disable-minimal-toolchains --try-toolchain=foss,2021b

runs into the same issue you had.

Anyway, regardless of the cause, as you mentioned you already have R-4.1.2 installed and want to reuse dependencies, I'm not sure if --try-toolchain is what you want to do here. E.g. R-4.1.2 is using SQLite version 3.36 as dependency. If you would do

eb -r --try-toolchain foss,2021b -D R-4.1.0-foss-2021a.eb

then you're trying to build an R-4.1.0 based on an EasyConfig that uses SQLite version 3.35 as dependency. I.e. --try-toolchain=X only changes the toolchain; it does not update the dependency versions to correspond with the versions used in the new toolchain X. What you probably want to do is:

eb -r --try-toolchain foss,2021b -D R-4.1.0-foss-2021a.eb --try-update-deps --experimental

The --try-toolchain here will bump the toolchain version, and then the --try-update-deps will try to bump the versions of dependencies to those that are used in the new toolchain. I.e. in this example, it would check which SQLite is the common version used in foss-2021b, find that it is 3.36, and use that. That would enable you to reuse the dependencies from R-4.1.2 as much as possible. Note that this is still an experimental feature though. Unfortunately, I tried to run it with the above command, and it still hit an error in doing a version comparison (TypeError: '<' not supported between instances of 'int' and 'str'). I'm afraid that in this case all that's left is to take the R-4.1.0-foss-2021a.eb, copy it, manually update the toolchain, and manually update the versions of the dependencies to match those in R-4.1.2-foss-2021b.eb...

eburgueno commented 2 years ago

> but it seems to get confused about R being at foss level, and the dependencies at GCCcore level

I think that might be it indeed. I haven't tried other combinations, but this makes sense.

> you're trying to build an R-4.1.0 based on an EasyConfig that uses SQLite version 3.35 as dependency

I am probably ok with that. I was more wanting to avoid needing to have multiple versions of the same toolchain and standardise everything on foss,2021b (and GCCcore,11.2.0).

> Unfortunately, I tried to run it with the above command, and it still hit an error in doing a version comparison (TypeError: '<' not supported between instances of 'int' and 'str')

Setting EB_PYTHON so that EasyBuild runs under Python 2 instead might work around this, but I haven't tried it.

casparvl commented 2 years ago

From my logs (with --debug):

== 2022-04-22 14:23:46,549 filetools.py:1126 INFO Index found for /sw/noarch/Centos8/2021/software/EasyBuild/4.5.4/easybuild/easyconfigs, so using it...
== 2022-04-22 14:23:46,564 build_log.py:265 INFO Searching (case-insensitive) for '^Java-.*.*.eb' in /scratch-shared/casparl/eb-gewsnv3u/tweaked_dep_easyconfigs
== 2022-04-22 14:23:46,564 filetools.py:1121 INFO No index found for /scratch-shared/casparl/eb-gewsnv3u/tweaked_dep_easyconfigs, creating one...
== 2022-04-22 14:23:46,565 parser.py:74 DEBUG Obtained parameters value for ['toolchain']: ["{'name': 'dummy', 'version': 'dummy'}"]
== 2022-04-22 14:23:46,565 parser.py:74 DEBUG Obtained parameters value for ['toolchain']: ['SYSTEM']
== 2022-04-22 14:23:46,565 parser.py:74 DEBUG Obtained parameters value for ['toolchain']: ["{'name': 'dummy', 'version': 'dummy'}"]
== 2022-04-22 14:23:46,565 parser.py:74 DEBUG Obtained parameters value for ['toolchain']: ["{'name': 'dummy', 'version': 'dummy'}"]
== 2022-04-22 14:23:46,566 parser.py:74 DEBUG Obtained parameters value for ['toolchain']: ["{'name': 'dummy', 'version': 'dummy'}"]
...
== 2022-04-22 14:23:46,585 parser.py:74 DEBUG Obtained parameters value for ['version', 'versionsuffix']: ['1.8.0_241', None]
== 2022-04-22 14:23:46,585 parser.py:74 DEBUG Obtained parameters value for ['version', 'versionsuffix']: ['1.8.0_271', None]
== 2022-04-22 14:23:46,585 parser.py:74 DEBUG Obtained parameters value for ['version', 'versionsuffix']: ['1.8.0_281', None]
== 2022-04-22 14:23:46,586 parser.py:74 DEBUG Obtained parameters value for ['version', 'versionsuffix']: ["1.%s.0_%s' % (local_java_version, local_patch_version)", '-OpenJDK']
== 2022-04-22 14:23:46,586 parser.py:74 DEBUG Obtained parameters value for ['version', 'versionsuffix']: ['1.8.0_311', None]
== 2022-04-22 14:23:46,586 parser.py:74 DEBUG Obtained parameters value for ['version', 'versionsuffix']: ['1.8', None]
== 2022-04-22 14:23:46,586 parser.py:74 DEBUG Obtained parameters value for ['version', 'versionsuffix']: ['1.8_191', '-b26-OpenJDK']

Maybe the ["1.%s.0_%s' % (local_java_version, local_patch_version)", '-OpenJDK'] is the issue: if the string replacement doesn't happen before the version comparison, this would result in the comparison between a string and an int...

casparvl commented 2 years ago

I think there are actually two problems here. First, it seems that fetch_parameters_from_easyconfig simply returns the version and versionsuffix based on a regex. That means string replacements will not have been made. I.e. it may return

"1.%s.0_%s' % (local_java_version, local_patch_version)"

as a version. Clearly, Python has no idea how to compare that to e.g. a version 1.8.0_281, as it would compare the '8' to the '%s'.
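To illustrate the difference, here is a minimal sketch and not EasyBuild's actual code: the easyconfig contents, the regex, and both helper names are assumptions made up for this example. It contrasts grabbing the raw right-hand side of the version line with evaluating the file as Python.

```python
import re

# Hypothetical easyconfig contents, for illustration only.
EASYCONFIG_TEXT = """
name = 'Java'
local_java_version = '8'
local_patch_version = '292'
version = '1.%s.0_%s' % (local_java_version, local_patch_version)
versionsuffix = '-OpenJDK'
"""


def version_by_regex(text):
    """Grab the raw right-hand side of the 'version' line (assumed regex)."""
    match = re.search(r"^\s*version\s*=\s*(?P<param>\S.*?)\s*$", text, re.M)
    # Stripping quotes from both ends leaves the format expression unevaluated.
    return match.group('param').strip("'\"")


def version_by_evaluation(text):
    """Execute the text as Python so that the string formatting is applied."""
    namespace = {}
    exec(text, namespace)
    return namespace['version']


print(version_by_regex(EASYCONFIG_TEXT))       # raw, unsubstituted expression
print(version_by_evaluation(EASYCONFIG_TEXT))  # resolved version string
```

Note that a plain exec only works for this toy file; real easyconfigs rely on constants and templates that EasyBuild's own parser provides, so this only shows why the regex result still contains '%s'.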

The EasyConfigs with string replacements in the version aren't the only ones that cause issues. There's also e.g. Java-1.8_191-b26-OpenJDK.eb. The version there is 1.8_191. Again, LooseVersion has no idea how to compare this with e.g. 1.8.0_281 (which one is supposed to be 'greater'?), as they are simply two different versioning schemes.
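Both failure modes can be reproduced with a small stand-in for LooseVersion. The sketch below assumes the same tokenisation idea as distutils' LooseVersion (digit runs become ints, everything else stays a string); loose_components and describe_comparison are hypothetical helpers written for this example, not EasyBuild functions.

```python
import re

# Split into digit runs, lowercase letter runs and dots (LooseVersion-style).
_COMPONENT_RE = re.compile(r'(\d+|[a-z]+|\.)')


def loose_components(version):
    """Turn a version string into a list of ints and strings."""
    parts = [p for p in _COMPONENT_RE.split(version) if p and p != '.']
    components = []
    for part in parts:
        try:
            components.append(int(part))
        except ValueError:
            components.append(part)
    return components


def describe_comparison(left, right):
    """Report the result of 'left < right', or the TypeError it raises."""
    try:
        return str(loose_components(left) < loose_components(right))
    except TypeError as err:
        return 'TypeError: %s' % err


TEMPLATE = "1.%s.0_%s' % (local_java_version, local_patch_version)"

print(describe_comparison('1.8.0_241', '1.8.0_281'))  # same scheme
print(describe_comparison('1.8.0_281', TEMPLATE))     # unsubstituted template
print(describe_comparison('1.8_191', '1.8.0_281'))    # different scheme
```

In the last two cases an int component ends up at the same position as a str component, which Python 3 refuses to order.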

After changing all of the related Java EasyConfigs, my

eb -r --try-toolchain foss,2021b -D R-4.1.0-foss-2021a.eb --try-update-deps --experimental

then completed successfully. However, I think we should implement some try-except logic in tweak.py for what to do when it fails to compare two versions (probably: warn the user, but proceed, on the assumption that a candidate whose version is not comparable is probably not viable anyway). The other option would be to make fetch_parameters_from_easyconfig more intelligent and actually parse the EasyConfig (instead of regex matching), so that string replacements are made. But that might have other, unforeseen implications...
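A rough sketch of what such try-except logic could look like, under stated assumptions: this is not the code in tweak.py, and loose_components and not_older_than are hypothetical helpers that only mimic LooseVersion-style comparison. Candidates that cannot be compared with the reference version trigger a warning and are skipped.

```python
import re
import warnings

_COMPONENT_RE = re.compile(r'(\d+|[a-z]+|\.)')


def loose_components(version):
    """LooseVersion-style split: digit runs become ints, the rest stay strings."""
    parts = [p for p in _COMPONENT_RE.split(version) if p and p != '.']
    return [int(p) if p.isdigit() else p for p in parts]


def not_older_than(reference, candidates):
    """Keep candidates that are not older than reference; skip incomparable ones."""
    ref = loose_components(reference)
    kept = []
    for candidate in candidates:
        try:
            is_older = loose_components(candidate) < ref
        except TypeError:
            # Mixed int/str components: warn and treat the candidate as not viable.
            warnings.warn("Cannot compare version %r with %r, skipping it"
                          % (candidate, reference))
            continue
        if not is_older:
            kept.append(candidate)
    return kept


# Version strings taken from the debug log earlier in this thread.
CANDIDATES = [
    '1.8.0_241',
    '1.8.0_281',
    "1.%s.0_%s' % (local_java_version, local_patch_version)",
    '1.8.0_311',
    '1.8',
    '1.8_191',
]

print(not_older_than('1.8.0_271', CANDIDATES))
```

The trade-off is that an incomparable candidate is silently treated as unusable apart from the warning, so a perfectly valid EasyConfig with an unusual version scheme would never be picked.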

robogast commented 1 year ago

As a fast-and-loose comment: I also run into the same issue, i.e. string substitution not being done when comparing versions, resulting in the TypeError: '<' not supported between instances of 'int' and 'str' error.