Closed RaulPPelaez closed 7 months ago
The CI does not have a GPU available, so the only thing it can do is to check that compilation succeeds and that it is possible to import the different solvers from Python (which does not require CUDA).
You can check the output of the CI script by clicking on "Actions" in the top menu. You can also see the run for the latest commit in this PR at the bottom here by clicking on Details:
@rykerfish It would be great if you could review and approve this PR.
I (understandably) don't have push permissions to this branch, so I'll detail the issues here.
import libMobility.Solver
specifically to access any of the solvers (e.g. import libMobility.SelfMobility
). When constructing the solver, it seemed like the constructor had to be explicitly called using libMobility.SelfMobility.SelfMobility(args)
which led to a repetitive naming convention. Was there a better way around this that I didn't figure out?For the python example, NBody would give floating point exceptions in the call to MDot. For PSE, I encountered the error below.
RuntimeError: CudaCheckError() failed at /home/rfish/school/codes/libMobility/third_party/uammd/src/utils/ParticleSorter.cuh:234: the provided PTX was compiled with an unsupported toolchain. - code: 222
On the c++ side, it would randomly pick between one of two different errors. The first was an out of memory error from CUDA, and the second being shown below:
terminate called after throwing an instance of 'thrust::system::system_error' what(): __copy::trivial_device_copy H->D: failed: cudaErrorInvalidValue: invalid argument
All of the above could absolutely be an issue with my installation (nvcc is version 12.3), but I have been using older versions of libMobility for Brennan's fiber-related projects that gives me some assurance it's working.
I added you as a maintainer, I thought I already did! You should have push access to this branch now. But ofc feel free to just open a new PR with my commits.
If updating the driver is not a possibility then you need to use a previous version of CUDA. Conda-forge does not expose the convenient cuda-nvcc and such packages for CUDA<12. You would have to either change to cudatoolkit-dev, use the nvidia channel or just use the system's CUDA installation.
Thanks for that- I had forgotten to update my drivers when I updated from 12.3 to 12.4. Updating those made most things good to go.
One note- the NBody example in python can still cause exceptions. Setting NperBatch to -1 can leave NperBatch uninitialized since it defaults to the number of particles. However, if initialize hasn't been called (and it hasn't been yet in example.py), then the number of particles is uninitialized as well, leaving NperBatch as a random value. Getting random values like 0 or larger than memory caused me some floating point exceptions or out of memory errors.
I'm still having unknown issues with the c++ example, which I'll look into more soon.
Now that things are mostly working on my side, I plan to experiment and get more used to using libMobility. I'll keep notes so I can ask you any questions that come up, and hopefully I can use the notes to start working on documentation as per #6 if that sounds good to you!
@RaulPPelaez Sorry about the delay, but I finally got around to making the changes I wanted to make here. I set up an init.py file and (after some learning) adjusted cmake to put it in the right place to fix the solver naming issue in the Python imports.
I was having random behavior breaking the example codes, so I made those work on my end. I think the main problem is that you can get undefined behavior in by calling NBody's setParameters()
before initialize()
since NBody's numberParticles
parameter won't be initialized, making the default setting for NperBatch
unsafe. Should that behavior get modified? To keep the parameters decoupled, we could check if the solver has been initialized within setParameters()
, but there is probably a better solution.
That is a good finding about setParameters. Please open another PR for that. In this one we should just get the CMake compilation working.
See here for an example of a CMakeLists that installs a python library https://github.com/openmm/NNPOps/blob/d15cb9196e283b6b55f88a93d85232458f64fa18/CMakeLists.txt#L103-L119 I think you did good, however. Please review the PR and lets merge it!
In this PR I update the build system in libMobility:
make install
.By installing libMobility to the conda environment all libraries (python and C++) and headers will be available without any user intervention as if they were other system libraries.
For instance, an user seeking to link NBody into his C++ code can just link with
-llibMobility_NBody
and include the headers as:without having to add any "-L" or "-I" directives.
Additionally, after calling
make install
, the user canimport libMobility
without extra steps.cc @brennansprinkle @rykerfish please check this out and lmk if you ran into any trouble with the latest install instructions.