darshan-hpc / darshan

Darshan I/O characterization tool
Other
55 stars 27 forks source link

MAINT: attempt to move to pyproject.toml #969

Closed shanedsnyder closed 6 months ago

shanedsnyder commented 6 months ago

Inspired by recent CI failures (presumably from setuptools 69.0.0 and PEP621), I tried to move most of our project metadata to pyproject.toml. I'm getting some test failures locally but want to see how CI looks here.

shanedsnyder commented 6 months ago

I'm still looking at this to see if I can move as much as possible out of setup.py and setup.cfg. Feels like I should just move everything while I'm doing this rather than have things spread across multiple files.

My most recent commit simplifies setup.py to only do the C extension as well as manual copying of AutoPerf backend files. I'll think more about those things to see if any other changes are possible and make sense. For one thing, when I pip install Darshan from source on my laptop, it is not automatically copying the AutoPerf files. cibuildwheel, on the other hand, seems to properly copy these files into the wheels. Maybe cibuildwheel was using an older version of setuptools that is using setup.py rather than pyproject.toml?

What's the story on the stuff currently in setup.cfg? Particularly, this stuff:

[bumpversion]
current_version = 3.4.4.0
commit = False
tag = False

[bumpversion:file:setup.py]
search = version='{current_version}'
replace = version='{new_version}'

[bumpversion:file:darshan/__init__.py]
search = __version__ = '{current_version}'
replace = __version__ = '{new_version}'

Is that actually accomplishing anything? AFAIK, we still need to manually modify version numbers at release time? I'd love to automate that, just not sure that's what's happening here? @jakobluettgau did you contribute this originally?

shanedsnyder commented 6 months ago

I spent some more time looking at this, and there are 2 outstanding issues on my end I'd like to better understand before merging. Both of these issues pop up for me locally when trying the steps you would typically take to build wheels -- apparently these are not an issue in our cibuildwheel pipeline, somehow.

  1. If you try to pip install install Darshan from source with PYDARSHAN_BUILD_EXT set (i.e., what we do when building wheels, this forces the build of the C extension), I'm getting lots of warnings then eventually an error:

    /tmp/pip-build-env-w7b31h89/overlay/lib/python3.8/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'darshan.backend' is absent from the `packages` configuration.
    !!
    
          ********************************************************************************
          ############################
          # Package would be ignored #
          ############################
          Python recognizes 'darshan.backend' as an importable package[^1],
          but it is absent from setuptools' `packages` configuration.

    So, building the extension apparently makes it so none of the packages (directories) in PyDarshan are included for install? Any ideas? Anyone else able to trip this locally?

  2. The code that copies AutoPerf modules in setup.py doesn't ever run for me locally, so when I build PyDarshan I have to manually copy them to get them in the install.

jakobluettgau commented 6 months ago

1) Do you have a commit where the pip install . route to installing is working? Because the current head certainly does fail with different errors when PYDARSHAN_BUILD_EXT is set. But having that work again (?) sounds good to me in principle, but then I am not sure if I would make it a blocker if it is broken anyway.

2) Hm, are you sure it is not run? For me it does run. But without initializing and updating git submodule update --init the submodules the inner loop just doesnt have anything to do. I also think I recall situations where the build stage was isolated and this did not work at all, but maybe that behavior changed again or I am mixing something up.

cibuildwheel has the following block which ensures the modules are added though, so that might be why it is not an issue there:

[tool.cibuildwheel.macos]
before-all = [
    "brew install automake",
    "brew install openblas",
    "brew install lapack",
    "git submodule update --init",       # <== here
    "./prepare.sh",
    "./configure --disable-darshan-runtime --enable-apxc-mod --enable-apmpi-mod",
    "make install"
]

3) Regarding "bumpversion", yes that is a little utility that is (or was) used by many projects to automate updating version strings across different places within projects.. e.g., by applying some search/replace rules as we have in the setup.cfg. From a brief investigation there is multiple successors to bumpversion for which development has staled, forks of this have caught up with the revised and more modern build processes though, and there is some that get the version information + rules from the pyproject.toml instead. We'd need to look into which one to go with and if it really is worth it to have this as an additional dev/build dependency. Maybe @tylerjereddy has a recommendation.

4) Other things in setup.cfg

[bdist_wheel]
universal = 1

I think the build-backend = "setuptools.build_meta" already takes care of everything we want for the binary wheels. So this I think we can discard.

[flake8]
exclude = docs
ignore = E501 E231 E265 E303 E271 E272

We are not really using flake8 anymore in favor of mypy so I think we can drop this.

[aliases]
test = pytest

[tool:pytest]
testpaths = tests

Defaults for pytest we can move to pyproject.toml, but these I think we can just discard completely.

shanedsnyder commented 6 months ago

Thanks for the details. Maybe just focusing on this for now for simplicity:

2. Hm, are you sure it is not run? For me it does run. But without initializing and updating git submodule update --init the submodules the inner loop just doesnt have anything to do. I also think I recall situations where the build stage was isolated and this did not work at all, but maybe that behavior changed again or I am mixing something up.

Here's my steps on a clean checkout of Darshan main and a clean venv:

(venv) shane@shane-x1-carbon /tmp/darshan/darshan-util/pydarshan (main) $ git submodule update --init
Submodule 'modules/autoperf' (https://github.com/argonne-lcf/autoperf.git) registered for path '../../modules/autoperf'
Cloning into '/tmp/darshan/modules/autoperf'...
Submodule path '../../modules/autoperf': checked out '8623a0652735785fe87d89514112e43e4f52a78d'
(venv) shane@shane-x1-carbon /tmp/darshan/darshan-util/pydarshan (main) $ pip install . > out
(venv) shane@shane-x1-carbon /tmp/darshan/darshan-util/pydarshan (main) $ ls darshan/backend/
api_def_c.py  cffi_backend.py  __init__.py
(venv) shane@shane-x1-carbon /tmp/darshan/darshan-util/pydarshan (main) $ ls venv/lib64/python3.8/site-packages/darshan/backend/
api_def_c.py  cffi_backend.py  __init__.py  __pycache__
(venv) shane@shane-x1-carbon /tmp/darshan/darshan-util/pydarshan (main) $ ls ../../modules/
autoperf

Note, I changed the build reqs in pyproject.toml to work around the build issues we've been trying to address:

[build-system]
requires = [
    "wheel",
    "setuptools<69.0.0",
]

This has certainly worked for me in the past...no idea what's changed.

shanedsnyder commented 6 months ago

Eh, it seems that pip (or setuptools, whatever) is copying everything in the pydarshan directory into a temporary folder in /tmp before running setup.py. So then, this sort of stuff won't really work as intended as the script is no longer running "inside" the Darshan source repo:

for root, dirs, files in os.walk("../../modules"):

I'm not sure in what circumstances the above would have worked -- I thought it worked for me at some point, but it doesn't now and I don't see any evidence that this sort of behavior ever changed, so maybe I'm not remembering properly.

I really don't understand how the cibuildwheel pipeline handles this without error. The pipeline has logic to check out the submodule as you point out, but that does not copy the submodule files into the pydarshan package. Bizarre.

Ignoring how cibuildwheel is making this work, I really don't see any sane way of handling this reliably. Having your build process have to dig for and manually copy files outside of the package repo is extremely quirky and unsurprisingly not something these tools care about. Having things like setup.py running arbitrary code seems to be falling out of favor anyway, so I'm not sure we should stick with it.

So, I think my inclination is to just do something simple here and stop worrying about this problem. I propose:

  1. We manually checkout the git submodule and copy the files over before building pydarshan.
  2. We update cibuildwheel and all documentation about building from source to make this step clear (really, it's just one additional step of running that make add-modules command after checking out the submodule).
  3. I've already talked with the AutoPerf folks about this, but I think we need to end the experiment of having that stuff in a different git repository. Trying to maintain and build packages for code spanning multiple git repositories is a huge pain. If I can make time to merge that back, then all of this quirkiness goes away for good.
shanedsnyder commented 6 months ago

Oh, shoot. I missed one key step from our github workflow: python -m pip install --upgrade pip

If I run that, then setup.py is able to copy in the AutoPerf files from the submodule no problem -- I guess my default venv is using an older pip version that has different behavior. I can also use @tylerjereddy's hack above to force the install to work properly when enabling the extension module. I think I assumed something else was going on because I did have LD_LIBRARY_PATH set, but it seems this toolchain expects LIBRARY_PATH instead.

In any case, as you guys point out, this is all working on cibuildwheel which is the important part. I mostly wanted to confirm my changes here didn't break that process, and got caught up trying to understand why I was getting different behavior locally. Thanks for having a closer look and steering me in the right direction.

I think I'll still take a stab at reducing setup.cfg and maybe seeing if there's something that could be done related to versioning. At the very least, reducing it down to a single value in the pyproject.toml would be nice.

shanedsnyder commented 6 months ago

Looks like we can just maintain a single value in darshan/__init__.py as outlined by the first option here: https://packaging.python.org/en/latest/guides/single-sourcing-package-version/#single-sourcing-the-package-version

I added that here. Assuming CI passes, I think I'll call this PR done, unless you guys have more feedback.