Open janden opened 2 years ago
2014 should be mostly good to go; somehow this project was ahead of that (https://github.com/flatironinstitute/cufinufft/issues/93).
Changing to 2014 would affect a user like myself on older machines that don't pick up 2014 wheels, but that should not be a concern for this project at this point.
Just in case, you might be able to adapt the pip calls during wheel building with --prefer-binary
or similar. Something like that would prefer the last published wheel over the latest source release. Maybe if you get a complaint you can use that to release a 2010 package in a pinch...
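For what it's worth, a minimal sketch of that stopgap (the cufinufft package name is assumed here; --prefer-binary is a standard pip flag):

```shell
# Illustrative: with --prefer-binary, pip favors the newest release that has
# a compatible *wheel* over a newer source-only release, so a machine that
# cannot use manylinux2014 wheels would fall back to the last release with a
# manylinux2010 wheel instead of trying to compile the sdist.
pip install --prefer-binary cufinufft
```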
All in all, changing to 2014 is probably past due.
On a related note, I just broke CI in https://github.com/flatironinstitute/cufinufft/commit/b61de35d303c2fb4bf445ee20f11f52e618f7855 by adding sm_80 to the NVARCH list in the Makefile - this was needed to get things working on our A100s here. However, it fails on the manylinux build:
detected target: manylinux
nvcc --device-c -c -std=c++14 -ccbin=g++ -O3 -arch=sm_80 -gencode=arch=compute_35,code=sm_35 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_86,code=compute_86 -Wno-deprecated-gpu-targets --default-stream per-thread -Xcompiler "-fPIC -O3 -funroll-loops -march=x86-64 -mtune=generic -msse4 -fcx-limited-range -std=c++14" -I/usr/local/cuda/include -Icontrib/cuda_samples -I include src/precision_independent.cu -o src/precision_independent.o
nvcc fatal : Value 'sm_80' is not defined for option 'gpu-architecture'
Do I have to just revert the Makefile, or is there hope of including this? I know nothing about manylinux and not much about CUDA / NV arch stuff.
Well, I reverted the Makefile for now; we can add a site make.inc for SM80.
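As a sketch, a site make.inc along those lines might just override NVARCH with the Ampere arch (the file path and name here are illustrative; a real site file should follow the project's existing make.inc convention):

```shell
# Hypothetical: write a site-specific make.inc that overrides NVARCH with
# the Ampere arch, leaving the default Makefile (and manylinux CI) alone.
# Building with this requires a CUDA toolkit new enough to know sm_80.
mkdir -p sites
cat > sites/make.inc.example <<'EOF'
NVARCH = -arch=sm_80 -gencode=arch=compute_80,code=sm_80
EOF
```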
Hi Alex, I suspect supporting that arch in either of the manylinux packages would require upgrading CUDA in the dockerfiles. Basically, I combined the official "manylinux" container with some NVIDIA container components to create the platform where the wheels are built.
Our manylinux2010 uses CUDA 10.1; our manylinux2014 uses CUDA 11.0.
I believe the CUDA version should be at least 11.1 for that GPU.
Upgrading the CUDA used for packaging could have the side effect of breaking backward compatibility with older cards, if that matters. That is, the prepackaged binary might not be compatible with a Maxwell (or older) card, or with a driver older than the one we built against. With that said, Maxwell is getting pretty old now... Changing the dockerfiles is totally feasible, but would require some testing.
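On the compatibility point, one hedged middle ground (flags illustrative, not the project's actual settings) is to pair the sm_80 entry with a compute_80 PTX entry, so drivers newer than the build toolkit can JIT the kernels even without matching prepackaged machine code; it does nothing for cards older than the oldest arch listed.

```shell
# Illustrative NVARCH value: the first -gencode emits SM80 machine code,
# the second embeds compute_80 PTX that a newer driver can JIT-compile
# for GPUs released after the toolkit we packaged with.
NVARCH='-gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80'
echo "$NVARCH"
```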
Hope that helps.
Thanks for the explanation. I think we're fine for now, since I added an FI site that has Ampere settings. Hope all's well, and if you're at a loose end, take a look at https://github.com/flatironinstitute/cufinufft/issues/123 Also, I wonder how the multi-GPU tests worked out - or maybe Johannes was doing those? But you're probably not at a loose end :) Cheers, Alex
OK, so are we agreed about moving to manylinux2014, and also to CUDA 11.0, then? As @garrettwrong says, the code is there, so it's mostly a matter of testing.
As discussed in #121, the new version of NumPy only releases binary wheels for manylinux2014, while our CI (and our releases) are for manylinux2010 for now. This is currently resolved by downgrading NumPy to the last version to publish manylinux2010 wheels (1.21), but this is not a tenable solution. One way to approach this would be to stick with manylinux2010 but build NumPy from source during testing. However, the fact that this is becoming a problem suggests that it is maybe time to move to manylinux2014. This should be relatively straightforward to implement, but we have to consider downstream issues. Are any dependent packages going to be severely affected by this?
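As a sketch of the two stopgaps above (invocations illustrative):

```shell
# Stopgap 1 (current): pin NumPy to 1.21.x, the last series to publish
# manylinux2010 wheels.
pip install "numpy==1.21.*"

# Stopgap 2 (alternative): stay on manylinux2010 but compile the latest
# NumPy from its sdist during testing - slower CI, but no version pin.
pip install --no-binary numpy numpy
```

Neither addresses the underlying drift; the second just trades the pin for longer CI runs, which is part of why moving to manylinux2014 looks overdue.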