fujiisoup / py3nj

Wigner's 3J, 6J, 9J symbols for python
https://py3nj.readthedocs.io/
Apache License 2.0
18 stars 5 forks source link

Ci for windows and macos #27

Closed fujiisoup closed 1 year ago

fujiisoup commented 1 year ago

CIs for windows and macos are set up. closes #25

Currently, installation for both the environments fails. @kalekundert Do you have an idea? My local windows machine also fails in installation, but with a different symptom...

fujiisoup commented 1 year ago

I think I gonna merge this PR with commenting out macos and windows, next week. The installation failures should be fixed in another PR.

kalekundert commented 1 year ago

Regarding the Mac OS failures, I think this is a relevant discussion: actions/runner-images#3371

It looks like the problem is that the Mac OS Github Actions runners don't have a gfortran executable. Instead, they have different executables for different versions of gfortran, e.g. gfortran-8, gfortran-9, etc. For some reason, meson isn't smart enough to find those. It looks like one solution is to create the appropriate symlinks by hand (as given in the above link). Another solution might be to do something like brew install gcc.

Another similar issue: scipy/scipy#17266

I didn't look that hard, but I couldn't find any examples where the windows build failed instead of being cancelled.


FYI, if you're not aware of it, mxschmitt/action-tmate is a useful tool for debugging CI issues. It gives you access to a shell running on an actual CI node, so you can try things out without having to make a bunch of commits.

kalekundert commented 1 year ago

It might also be useful to copy what numpy does, since they also use meson and presumably had this same problem. On Mac OS, they create a conda environment with the relevant compilers installed. On Windows, they use either use MSVC (which I guess is installed by default) or use choco to install clang.

fujiisoup commented 1 year ago

Thank you, @kalekundert

I didn't look that hard, but I couldn't find any examples where the windows build failed instead of being cancelled.

This was just due to my Github Action's configuration. Now it runs on Windows, but results in an error:

..\..\meson.build:1:0: ERROR: Executables created by fortran compiler gfortran are not runnable.

which I do not understand the meaning.

It might also be useful to copy what numpy does, since they also use meson and presumably had this same problem

Thank you for the suggestion. I looked numpy, but their configuration is huge and I may need a way more time to understand that.

kalekundert commented 1 year ago

I was able to fix the Mac OS build by doing adding compilers and openmp to the ci/requirements/* files, see my branch. While I was at it, I also removed numpy from those files and removed the --no-deps flag from the pip install command, to test that py3nj does properly install its dependencies, but that's not really relevant.

I still don't know what's wrong with the Windows build.

kalekundert commented 1 year ago

I think I know what's going on with Windows, but I don't know how to fix it. The error comes from line 345 of meson/mesonbuild/compilers/clike.py. It means that meson was able to build a simple "sanity check" program, but was unable to run it. It turns out that this has nothing to do with the compiler itself, though. The issue is that meson tried to run the program using a proper Windows path, specifically D:\a\actions-sandbox\actions-sandbox\py3nj\b\meson-private\sanitycheckf.exe. But Github has somehow configured their Windows runners to use Linux-style paths. When I tried to run /d/a/actions-sandbox/actions-sandbox/py3nj/b/meson-private/sanitycheckf.exe, it completed without error.

So I think we either need some way of telling meson to use Linux-style paths despite running on Windows, or some way of telling Github to not do whatever it's doing to make Windows seem more Linux-like.

kalekundert commented 1 year ago

I made some more progress on my branch. I was wrong earlier about there being nothing wrong with the fortran compiler; using the awvwgk/setup-fortran@v1 action to setup a fortran compiler solved the sanity check issue. Now the build completes on windows, but the tests fail with the following message:

c:\hostedtoolcache\windows\python\3.8.10\x64\lib\site-packages\py3nj\wigner.py:2: in <module>
104
    from py3nj import _wigner
105
E   ImportError: DLL load failed while importing _wigner: The specified module could not be found.

I thought this might just require updating the os.add_dll_directory() call at the top of __init__.py, but I wasn't able to get that to work.

fujiisoup commented 1 year ago

Thank you, @kalekundert, for your work. Now the situation is exactly the same with what is happening in my local Windows #26 .

I thought this might just require updating the os.add_dll_directory() call at the top of init.py, but I wasn't able to get that to work.

I think so too. In Python > 3.8 with Windows, dll's are only loaded from trusted locations. If gfortran is installed in user's local directly, it is not automatically loaded, even if gfortran is in PATH.

In my case, if I could add a path for libgomp-1.dll libgfortran-5.dll are added by os.add_dll_directory(), it is successfully imported.

So I wonder whether we could find the path to these dlls automatically when installation.

fujiisoup commented 1 year ago

Looks like this comment in this issue is related.

I think I will try https://github.com/adang1345/delvewheel next.

kalekundert commented 1 year ago

I finally got it to work!! I followed the recommendation from this issue: mesonbuild/meson-python#178

The idea is just to link the extension module statically, so there are no DLLs to worry about. It's not very elegant, but at least it works on CI. It also removes the need for the os.add_dll_directory() call in __init__.py. Let me know if it works on your local Windows machine.

I do think that the delvewheel approach would be more elegant, if you can get it to work.


I think the problem with the where gfortran approach might have been that meson actually uses the $FC variable to find which fortran compiler to use, so you might be not be using the right path. Of course, the $FC variable doesn't necessarily exist after the build process, so we'd probably have to generate some file during the build process with the right paths. It'd be very hacky.

fujiisoup commented 1 year ago

@kalekundert It works great also on my local windows! Thank you for your work.

I think static link is just great for this case. gfortran is just a few MB and might be so stable that we don't expect its significant update.

I really appreciate your contribution.

kalekundert commented 1 year ago

Glad to help! That was way harder than I thought it would be, but hopefully this setup will continue to work for a long time.

If you haven't already thought about it, it might be a good idea to upload the wheels from all 3 operating systems to PyPI. Here are some instructions on how to do this.