tpgillam / mt2

Stransverse mass computation, compatible with numpy.
MIT License
4 stars 3 forks source link

wheels are not correctly built #55

Closed lgray closed 2 weeks ago

lgray commented 3 weeks ago

When I then go to use mt2:

(coffea-dev) lgray@Lindseys-MacBook-Pro dependent-repos-of-coffea % conda create -y --name test_env_5 numpy=1.23.5 python=3.9
(coffea-dev) lgray@Lindseys-MacBook-Pro dependent-repos-of-coffea % conda activate test_env_5
(test_env_5) lgray@Lindseys-MacBook-Pro dependent-repos-of-coffea % pip install mt2
Collecting mt2
  Downloading mt2-1.2.1-cp39-cp39-macosx_11_0_arm64.whl.metadata (7.6 kB)
Requirement already satisfied: numpy>=1.19.3 in /Users/lgray/miniforge3/envs/test_env_5/lib/python3.9/site-packages (from mt2) (1.23.5)
Downloading mt2-1.2.1-cp39-cp39-macosx_11_0_arm64.whl (22 kB)
Installing collected packages: mt2
Successfully installed mt2-1.2.1
(test_env_5) lgray@Lindseys-MacBook-Pro dependent-repos-of-coffea % python
Python 3.9.19 | packaged by conda-forge | (main, Mar 20 2024, 12:55:20) 
[Clang 16.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
im>>> import mt2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'mt2'
>>> 
(test_env_5) lgray@Lindseys-MacBook-Pro dependent-repos-of-coffea % ls /Users/lgray/miniforge3/envs/test_env_5/lib/python3.9/site-packages
README.txt              distutils-precedence.pth        numpy-1.23.5.dist-info          pkg_resources               wheel
_distutils_hack             mt2-1.2.1.dist-info         pip                 setuptools              wheel-0.44.0.dist-info
_mt2.cpython-39-darwin.so       numpy                   pip-24.2.dist-info          setuptools-72.2.0-py3.12.egg-info

So something's not quite right in the way the wheels are being generated. The mt2 shared object library is being installed into the root of site-packages!

lgray commented 3 weeks ago

@kmohrman @henryiii

tpgillam commented 2 weeks ago

Yep this is broken. The extension module _mt2 is going i the wrong place, but also the python wrapper module mt2 isn't installed at all.

henryiii commented 2 weeks ago

The name of the module is wrong - it should be "mt2._mt2". And the packages need to be discovered or they won't be included.

You should use pipx run build and then list the contents of the created sdist (tar -tf dist/*.tar.gz) and wheel (unzip -l dist/*.whl) to verify packaging is working.

henryiii commented 2 weeks ago

I quickly tried it but got:

src/main.cpp:9:10: fatal error: 'lester_mt2_bisect_v7.h' file not found

Ah, your SDists are broken. Build makes a wheel from the SDist by default.

henryiii commented 2 weeks ago

You can't remove the MANIFEST.in with setuptools, that's still required. Unless you upgrade to a better build system (like scikit-build-core ;) ), you still need MANIFEST.in even with pyproject.toml.

$ pipx run check-manifest
lists of files in version control and sdist do not match!
missing from sdist:
  HISTORY.rst
  Makefile
  examples/mc.ipynb
  src/lester_mt2_bisect_v7.h
  src/mt2_Lallyver2.h
  src/mt2_bisect.h
no MANIFEST.in found; you can run 'check-manifest -c' to create one
suggested MANIFEST.in rules:
  include *.rst
  include Makefile
  recursive-include examples *.ipynb
  recursive-include src *.h
henryiii commented 2 weeks ago

Okay, have a patch. FYI, also getting a couple of probably-easy-to-fix warnings:

In file included from src/main.cpp:10:
src/mt2_Lallyver2.h:536:34: warning: unused variable 'checkdeltasqsq' [-Wunused-variable]
                    const double checkdeltasqsq = checkdeltasq * checkdeltasq;
                                 ^
src/mt2_Lallyver2.h:564:34: warning: unused variable 'newdeltaLBsqsq' [-Wunused-variable]
                    const double newdeltaLBsqsq = newdeltaLBsq * newdeltaLBsq;
                                 ^
2 warnings generated.
lgray commented 2 weeks ago

@henryiii they don't want to change that code in any way because it's some reference version. Perhaps we can use setuptools to apply a patch though without editing the source in github? The warnings are annoying and it's just commenting out those two lines.

henryiii commented 2 weeks ago

Maybe -Wno-unused-variable could be added on Clang and GCC? You probably could even push/pop the warnings when including the file.

henryiii commented 2 weeks ago

FYI, I've added a bit how to check the contents of wheels and SDists to https://scikit-build-core.readthedocs.io/en/latest/build.html.

tpgillam commented 2 weeks ago

The name of the module is wrong - it should be "mt2._mt2"

Strongly worded! I don’t see why it should necessarily be a submodule. I can see preferences to avoid a package having multiple top level modules but I don’t believe it is prohibited.

Not to say it can’t or shouldn’t be changed, but I don’t think this is the cause of the problem. Previously wheels worked fine with this approach.

tpgillam commented 2 weeks ago

you still need MANIFEST.in even with pyproject.toml

thanks, although that’s disappointing!

tpgillam commented 2 weeks ago

@henryiii they don't want to change that code in any way because it's some reference version. Perhaps we can use setuptools to apply a patch though without editing the source in github? The warnings are annoying and it's just commenting out those two lines.

When the wheels are correctly built then, hopefully, we would have binaries for all platforms so no users would see warnings in normal operation.

Otherwise, @kesterlester can comment as to whether he’s happy for changes to be made here (Lally’s mt2 implementation ).

kesterlester commented 2 weeks ago

I have zero experience of the technicalities of python package distribution, wheels ,manifests, tomls and SDists, etc. So as far as all that stuff goes, I leave that to those who know what they are talking about!

But as far as the inner C++ sources go, there I would have a strong preference for the C++ of the original library being preserved unaltered as a reference implementation --- albeit I see no problem with the build tool of a particular packaging library applying a few patches at build time if that makes their distribution easy (or easier) to maintain.

The main reason from my side to keep the C++ source unmodified by concerns due to packaging libraries is so that there is only one core implementation codebase to maintain, tweak, grow, improve, and so that subsequent upgrades to that reference can/will feed through to any distributions which package the reference. After all, although this pypi package is the most common mt2 distribution format today, is has not been so in the past and it may not be so in the future .... so I'm keen that the tail doesn't wag the dog.

Am not 100% sure if this answer addresses the issue I was being asked to comment on. If I've missed the point, please say!

lgray commented 2 weeks ago

Yes, what we are saying is that you can do two things:

That being said, I still don't understand why we are getting so hung up on two unused variables getting commented out. Git can handle this sort of divergence easily. However, it's y'all's code repository and so long as stuff works for my users I'm happy.

henryiii commented 2 weeks ago

Strongly worded! I don’t see why it should necessarily be a submodule. I can see preferences to avoid a package having multiple top level modules but I don’t believe it is prohibited.

I thought that's why it wasn't building at first. The problem actually was that it wasn't including the mt2 python module at all.

Yes, what we are saying is that you can do two things

Three things. The third option is to just turn off these warnings when including the file via compiled specific directives. It's messy but precise and doesn't require the header file to be changed.

kesterlester commented 2 weeks ago

Yes, what we are saying is that you can do two things:

  • have a patch applied that comments the code before building the wheels to remove the warnings

    • this avoids needing to change the code in git history
    • maintenance burden is then on the patch which is easy to generate
  • use a -W flag when compiling the appropriate file

Both the above sound great to me.

  • you can always easily explain to a user why this warning is being ignored, unless you are doing something highly suspicious

  • you can always easily explain to a user why this warning is being ignored, unless you are doing something highly suspicious

That being said, I still don't understand why we are getting so hung up on two unused variables getting commented out. Git can handle this sort of divergence easily. However, it's y'all's code repository and so long as stuff works for my users I'm happy.

Sorry about that! I agree it's a silly thing and we shouldn't get hung up on it.

lgray commented 2 weeks ago

A new release at your soonest convenience would be deeply appreciated. @tpgillam Thanks!

tpgillam commented 2 weeks ago

When I next have a chance to work on this, I'll find a way to test the wheels properly in CI after they've been built. We want this prior to releasing again to ensure that things are definitely not broken this time.

lgray commented 2 weeks ago

So after messing about with things a little, the reason the cibuildwheel tests didn't catch this is because the pythonpath is picking up the mt2 package from the local directory and doesn't actually test the integrity of the wheel as is.

A correcting action would be to put python (and C++) source in a src/ subdir so this accidental import cannot happen. Likewise, having the module built as mt2._mt2 so it follows the python source around when installed is probably helpful in eliminating confounding situations as well (but doesn't directly pertain to this).

lgray commented 2 weeks ago

I can also confirm on my laptop that the current wheel-building setup works correctly and installs the package in a functioning state.