easybuilders / easybuild-easyconfigs

A collection of easyconfig files that describe which software to build using which build options with EasyBuild.
https://easybuild.io
GNU General Public License v2.0
380 stars 703 forks source link

suggestion: always use PythonBundle instead of PythonPackage #15639

Open smoors opened 2 years ago

smoors commented 2 years ago

for Python packages we use easyblock PythonPackage, unless it's more than 1 package, then we use PythonBundle.

this is annoying:

so my suggestion: why not always use PythonBundle from the start? is there any downside?

see also https://github.com/easybuilders/easybuild-easyconfigs/pull/15491

boegel commented 2 years ago

As far as I know from the top of my head, the installation of an easyconfig that uses PythonBundle with a single extension, or using PythonPackage directly, should be identical, both in terms of contents of the installation directory and the module file.

@smoors Are you up for verifying that's indeed the case first, before we consider actively recommending (and maybe even requiring it in PRs to the central easyconfigs repo) to use only PythonBundle even if there's only a single Python package being installed?

I guess that would be a bit awkward, but there's no technical reason not to do so that I can think of...

smoors commented 2 years ago

here are the differences in the module file. there are 3 main differences:

--- /scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pandas/modules/all/sklearn-pandas/2.2.0-foss-2021b.lua     2022-06-09 13:54:16.486328000 +0200
+++ /scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pd/modules/all/sklearn-pandas/2.2.0-foss-2021b.lua 2022-06-09 13:46:04.393136000 +0200
@@ -10,6 +10,11 @@
 More information
 ================
  - Homepage: https://github.com/scikit-learn-contrib/sklearn-pandas
+
+
+Included extensions
+===================
+sklearn-pandas-2.2.0
 ]==])

 whatis([==[Description: 
@@ -18,8 +23,9 @@
 columns to transformations, which are later recombined into features.]==])
 whatis([==[Homepage: https://github.com/scikit-learn-contrib/sklearn-pandas]==])
 whatis([==[URL: https://github.com/scikit-learn-contrib/sklearn-pandas]==])
+whatis([==[Extensions: sklearn-pandas-2.2.0]==])

-local root = "/scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pandas/software/sklearn-pandas/2.2.0-foss-2021b"
+local root = "/scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pd/software/sklearn-pandas/2.2.0-foss-2021b"

 conflict("sklearn-pandas")

@@ -36,9 +42,11 @@
 end

 prepend_path("CMAKE_PREFIX_PATH", root)
+prepend_path("LIBRARY_PATH", pathJoin(root, "lib"))
 setenv("EBROOTSKLEARNMINPANDAS", root)
 setenv("EBVERSIONSKLEARNMINPANDAS", "2.2.0")
 setenv("EBDEVELSKLEARNMINPANDAS", pathJoin(root, "easybuild/sklearn-pandas-2.2.0-foss-2021b-easybuild-devel"))

 prepend_path("PYTHONPATH", pathJoin(root, "lib/python3.9/site-packages"))
 -- Built with EasyBuild version 4.5.5.dev0-r8f014a8f1243272fef46c07b8a9341b1b86f6dbf
+setenv("EBEXTSLISTSKLEARNMINPANDAS", "sklearn-pandas-2.2.0")
smoors commented 2 years ago

in the install dir, the changes are very minimal:

--- /scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pandas/software/sklearn-pandas/2.2.0-foss-2021b/lib/python3.9/site-packages/sklearn_pandas-2.2.0.dist-info/direct_url.json>2022-06-09 13:53:52.512420000 +0200
+++ /scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pd/software/sklearn-pandas/2.2.0-foss-2021b/lib/python3.9/site-packages/sklearn_pandas-2.2.0.dist-info/direct_url.json>2022-06-09 13:45:22.299471000 +0200
@@ -1 +1 @@
-{"dir_info": {}, "url": "file:///tmp/sklearnpandas/2.2.0/foss-2021b/sklearn-pandas-2.2.0"}
\ No newline at end of file
+{"dir_info": {}, "url": "file:///tmp/sklearnpandas/2.2.0/foss-2021b/sklearnpandas/sklearn-pandas-2.2.0"}
\ No newline at end of file
diff -ur /scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pandas/software/sklearn-pandas/2.2.0-foss-2021b/lib/python3.9/site-packages/sklearn_pandas-2.2.0.dist-info/RECORD /scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pd/software/sklearn-pandas/2.2.0-foss-2021b/lib/python3.9/site-packages/sklearn_pandas-2.2.0.dist-info/RECORD
--- /scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pandas/software/sklearn-pandas/2.2.0-foss-2021b/lib/python3.9/site-packages/sklearn_pandas-2.2.0.dist-info/RECORD>-2022-06-09 13:53:52.514923000 +0200
+++ /scratch/brussel/vo/000/bvo00005/vsc10009/ebtest/sklearn-pd/software/sklearn-pandas/2.2.0-foss-2021b/lib/python3.9/site-packages/sklearn_pandas-2.2.0.dist-info/RECORD>-2022-06-09 13:45:22.302271000 +0200
@@ -4,7 +4,7 @@
 sklearn_pandas-2.2.0.dist-info/RECORD,,^M
 sklearn_pandas-2.2.0.dist-info/REQUESTED,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0^M
 sklearn_pandas-2.2.0.dist-info/WHEEL,sha256=Z-nyYpwrcSqxfdux5Mbn_DQ525iP7J2DG3JgGvOYyTQ,110^M
-sklearn_pandas-2.2.0.dist-info/direct_url.json,sha256=DPNhpA3gZTy5m8eyLDhUVeZoIGkOxmnXUBBnQn2x4Tk,90^M
+sklearn_pandas-2.2.0.dist-info/direct_url.json,sha256=N3RwyHpoG2H3qm-1a53hFoz1bBBTj5GVRHclw9xulPU,104^M
 sklearn_pandas-2.2.0.dist-info/top_level.txt,sha256=qwEWyA_CKkNRd16IWPhIuKZbfaHg0bcCRp2urjCBbZ0,15^M
 sklearn_pandas/__init__.py,sha256=YxXBJOGjCK312lz46J0yaPqz5aMPV_ARMw3Jz_Sptzo,237^M
 sklearn_pandas/__pycache__/__init__.cpython-39.pyc,,^M
boegel commented 2 years ago

@smoors As briefly discussed during the EasyBuild conf call this week, I can see the annoyance with PythonPackage vs PythonBundle, but I'm also wondering how we should move forward if we agree on recommending to only use PythonBundle going forward (assuming there's agreement on that between maintainers, I would like to hear a couple of additional voices in here @easybuilders/easybuild-easyconfigs-maintainers...)

Shall we start making the easyconfigs test suite fail for easyconfigs that use PythonPackage directly, and ask to switch to PythonBundle instead? If so, we should include the URL of a docs page that explains the motivation behind this...

branfosj commented 2 years ago

I'd prefer us to use PythonBundle. However, I am concerned about how we make it happen. I do not agree with making it required - at least not quickly, as it adds an extra barrier to contributions.

smoors commented 2 years ago

we could also provide a script that automatically converts PythonPackage easyconfigs to PythonBundle? or even run the script whenever a PythonPackage easyconfig PR is created?

boegel commented 2 years ago

If we make it required for the central easyconfig repository, we should only enforce it for easyconfigs being touched in new PRs, by adding a test to the easyconfigs test suite. Before enforcing the use of PythonBundle, we should add a page to the documentation to explain why we're doing this, and providing some clear guidelines for to convert an easyconfig using PythonPackage to PythonBundle (or implement a script or even an option in EasyBuild itself to do the conversion)