kexul closed this issue 3 years ago.
Looking at the OpenMM packages available on the Omnia channel here it doesn't look like there is a version built against CUDA 11. I think only the conda-forge version of OpenMM supports more recent CUDA versions. It is possible to build against this version, but it takes a little work. (We have a conda-forge recipe for Sire here which can be used to build a local conda package against OpenMM 7.5.1.)
I'll try to work out why the optimise_openmm script is failing to give you a sensible error message.
Hi @lohedges, I looked at the Omnia channel you linked and tried installing Sire in a CUDA 10.1 environment. Now the output of optimise_openmm is:
Starting optimise_openmm: number of threads equals 10
CUDA platform is not recognised by OpenMM!
available platforms are:
['Reference', 'CPU']
Let's see if we can do something about this....
Found a CUDA toolkit release version: 10.1
Trying to update OpenMM to match your CUDA version 10.1 for your OpenMM version 7.4.2
This may take a little while. Please hold tight!
................................................
==============================================================
Sending anonymous Sire usage statistics to http://siremol.org.
For more information, see http://siremol.org/analytics
To disable, set the environment variable 'SIRE_DONT_PHONEHOME' to 1
To see the information sent, set the environment variable
SIRE_VERBOSE_PHONEHOME equal to 1. To silence this message, set
the environment variable SIRE_SILENT_PHONEHOME to 1.
==============================================================
['Reference', 'CPU', 'CPU']
Something didn't work out with the update of OpenMM via conda, have a look at the output.
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: /root/miniconda3/envs/biosimspace
added / updated specs:
- openmm=7.4.2
The following packages will be downloaded:
package | build
---------------------------|-----------------
openmm-7.4.2 |py37_cuda101_rc_1 11.9 MB omnia/label/cuda101
------------------------------------------------------------
Total: 11.9 MB
The following packages will be SUPERSEDED by a higher-priority channel:
openmm omnia --> omnia/label/cuda101
Downloading and Extracting Packages
openmm-7.4.2 | 11.9 MB | ########## | 100%
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
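For reference, the update that optimise_openmm attempted above amounts to pinning OpenMM to the cuda101 label on the omnia channel. A hypothetical sketch of doing the same by hand (the environment name biosimspace is taken from the output above; adjust the label to match your toolkit):

```shell
# Hypothetical sketch of the update optimise_openmm performs: pin OpenMM to
# the build labelled for CUDA 10.1 on the omnia channel. Adjust the label
# (cuda92, cuda100, cuda101, ...) to match your installed CUDA toolkit.
conda install -n biosimspace -c omnia/label/cuda101 "openmm=7.4.2"

# Confirm which build was actually installed:
conda list -n biosimspace openmm
```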
Still not able to run the simulation though.
P.S. I'm doing all of this in a Docker container; the host machine, which has the same version installed, runs fine.
Ah, I could not get the simulation running with the official Docker image biosimspace/biosimspace-devel:latest. Does it have CUDA support?
Ah, I see. The docker image is built as part of our Azure CI pipeline on a VM without a GPU. This shouldn't matter though, since the available OpenMM platforms are detected at run-time.
Just to check: how did you install BioSimSpace on the host? Using the conda package? When you say "same version installed", do you mean the host and Docker container are using the same version of OpenMM? (Same OpenMM version and same CUDA driver build.) I assume that the host has CUDA drivers installed, but the Docker container doesn't.
How did you install BioSimSpace on the host? Using the conda package?
Yes, I installed BioSimSpace on the host via conda.
When you say "same version installed", do you mean the host and Docker container are using the same version of OpenMM?
Yes, the same version of OpenMM, which is:
# Name Version Build Channel
openmm 7.4.2 py37_cuda101_rc_1 omnia
My host has CUDA 11. I've tested CUDA 10.1 and CUDA 11 in the Docker container; both failed.
I assume that the host has CUDA drivers installed, but the Docker container doesn't.
I'm afraid that's not the case. I've installed other GPU-powered packages in the container, such as PyTorch and TensorFlow, and they can use the GPU fine. Besides, running the following OpenMM example with GPU acceleration works fine in the container:
from simtk.openmm.app import *
from simtk.openmm import *
from simtk.unit import *
from sys import stdout
pdb = PDBFile('input.pdb')
forcefield = ForceField('amber14-all.xml', 'amber14/tip3pfb.xml')
system = forcefield.createSystem(pdb.topology, nonbondedMethod=PME,
nonbondedCutoff=1*nanometer, constraints=HBonds)
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 0.004*picoseconds)
platform = Platform.getPlatformByName('CUDA')
simulation = Simulation(pdb.topology, system, integrator, platform)
simulation.context.setPositions(pdb.positions)
simulation.minimizeEnergy()
simulation.reporters.append(PDBReporter('output.pdb', 1000))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True,
potentialEnergy=True, temperature=True))
simulation.step(10000)
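Before suspecting OpenMM itself, it is also worth confirming that the container can see the NVIDIA driver at all. A small check (not from the thread; it requires the container to be launched with GPU access, e.g. docker run --gpus all or the nvidia-docker runtime):

```shell
# Sanity check: can this container see the NVIDIA driver?
# Falls back to a message if nvidia-smi is not available.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv
else
    echo "nvidia-smi not found: driver not visible inside the container"
fi
```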
Are you just seeing the following error:
RuntimeError: There is no registered Platform called "CUDA"
If so, could you try manually setting OPENMM_PLUGIN_DIR before running your SOMD command? Sire should set the correct path for you automatically, but perhaps that's not working in the Docker image. I imagine that you'd need to do something like:
export OPENMM_PLUGIN_DIR=/home/sireuser/sire.app/lib/plugins
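A hypothetical sanity check before exporting the variable: make sure the directory actually contains the OpenMM CUDA plugin. The path below is the sire.app layout shown above; for a conda install the plugins live under the environment's lib/plugins instead.

```shell
# Check that the plugin directory really contains a CUDA plugin library
# before pointing OPENMM_PLUGIN_DIR at it.
PLUGIN_DIR=/home/sireuser/sire.app/lib/plugins
if ls "$PLUGIN_DIR"/*CUDA* >/dev/null 2>&1; then
    export OPENMM_PLUGIN_DIR="$PLUGIN_DIR"
    echo "CUDA plugin found; OPENMM_PLUGIN_DIR set to $PLUGIN_DIR"
else
    echo "no CUDA plugin found in $PLUGIN_DIR"
fi
```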
I don't know too much about using Docker containers with CUDA, i.e. I'm not sure if you need to do something clever for the drivers on the host to be visible to the container. However, the fact that running OpenMM using CUDA outside of Sire works suggests that they are indeed being picked up.
When updating things inside the container, have you been using the installed Sire Miniconda, i.e. sire.app? You would be using commands like:
~/sire.app/bin/conda install ...
Hi @lohedges , thanks for your continued help!
I've noticed this special environment variable OPENMM_PLUGIN_DIR in the documentation, but did not quite understand what it does.
I installed Sire via mamba in Miniconda: mamba create -n biosimspace -c conda-forge -c omnia -c michellab biosimspace. The location of Sire in my environment is /root/miniconda3/envs/biosimspace/lib/python3.7/site-packages/Sire. I've navigated to that folder, but there is no plugins directory in it. Here is the content of the folder:
Analysis Base CAS Cluster Config Error FF ID IO MM Maths
Mol Move Qt Squire Stream System Tools Units Vol __init__.py __pycache__
I'm confused, you are using mamba to install BioSimSpace within the BioSimSpace Docker image? The OpenMM libraries are not Python libraries, so you need to look in /root/miniconda3/envs/biosimspace/lib/plugins.
I'm confused, you are using mamba to install BioSimSpace within the BioSimSpace Docker image?
Nope, I just installed BioSimSpace in a clean CentOS Docker image with CUDA enabled.
so you need to look in /root/miniconda3/envs/biosimspace/lib/plugins
/root/miniconda3/envs/biosimspace/lib/plugins seems to be the right path, but setting OPENMM_PLUGIN_DIR did not work 😔
And just to confirm, the following works fine when run in the active biosimspace conda environment:
# test.py
from simtk.openmm import *
platform = Platform.getPlatformByName("CUDA")
python test.py
(You've suggested this earlier, but just want to confirm that we're using the exact same environment.)
If so, the only difference is that the Python version is using the OpenMM Python API, whereas the somd-freenrg code is calling into Sire, which uses the C++ API. Assuming they are using the same version of libOpenMM, and the CUDA driver is the same, then there shouldn't be a difference, unless somehow the way the C++ API works means that it's unable to see the drivers on the host.
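One way to dig into a discrepancy like this is to ask OpenMM directly which platforms it registered and whether any plugin failed to load. A minimal diagnostic sketch (not from the thread), assuming the legacy simtk namespace used by OpenMM 7.x; it degrades gracefully if OpenMM is not importable:

```python
# Diagnostic sketch: list the platforms OpenMM registered at import time and
# any plugin load failures. Platform.getPluginLoadFailures() reports plugins
# (e.g. CUDA, OpenCL) that were found on disk but failed to load, for example
# because the CUDA driver library could not be resolved.
try:
    from simtk.openmm import Platform  # OpenMM 7.x namespace
except ImportError:
    Platform = None  # OpenMM not installed; functions below return empty lists


def available_platforms():
    """Names of the platforms registered at import time."""
    if Platform is None:
        return []
    return [Platform.getPlatform(i).getName()
            for i in range(Platform.getNumPlatforms())]


def plugin_load_failures():
    """Error messages for plugins that were found but failed to load."""
    if Platform is None:
        return []
    return list(Platform.getPluginLoadFailures())


if __name__ == "__main__":
    print("platforms:", available_platforms())
    print("failures:", plugin_load_failures())
```

If "CUDA" is missing from the platform list but a CUDA entry appears in the failures, the plugin was found but could not bind to the driver, which would point at the container setup rather than at Sire.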
And just to confirm, the following works fine when run in the active biosimspace conda environment:
Yes, it works fine, no error, no warning.
it's unable to see the drivers on the host
I think so too; it could be a problem with my hardware or Docker configuration. I'll dig into it and report back if I find anything. Anyway, thanks for your help so far, much appreciated!
Managed to get it working in a clean Ubuntu Docker image; maybe my CentOS image is broken...
Thanks for the update. Glad to hear that you got things working. I still find it strange that OpenMM worked directly, but not via SOMD. It would be good to know if there was something different in the Docker setup (other than CentOS vs Ubuntu) so that we could document the issue for other users.
Cheers.
I'll post my finding here if there is any update.
Hi @lohedges, I've tested several images pulled from NVIDIA's official Docker Hub (including Ubuntu, CentOS, CUDA 10, CUDA 10.1, ...), and all of them run fine after optimise_openmm, which shows that somd-freenrg is quite robust. I believe most users won't hit this problem if their CUDA is set up correctly.
In fact, I was using a Linux distribution based on CentOS but with some modifications to the kernel and Docker, which is most probably to blame.
Thanks for the update, that's really helpful. I'm also pleased to hear that things seem to be quite reliable. I'll update the docs regarding our Docker image so that users know that it won't work with CUDA. We never really intended it to be used in this way, rather it's a minimal base environment with an old glibc which we use for our CI and to build our manylinux binary installer.
Hi, I've installed Sire by
Then created some perturbed systems by BioSimSpace and ran the simulation by
It complained:
I've looked at OpenMM's documentation and used its self-test command, which showed:
When optimise_openmm was used, it showed:
I'm using CentOS 7.2 with CUDA 11. Here is the output of nvcc --version: