michellab / Cluster

This repository is used for tracking any issues regarding the cluster

no opencl platform available to somd #5

Closed jmichel80 closed 7 years ago

jmichel80 commented 9 years ago

Test done on node006

julien@node006:~/projects/Thrombin/dataset00001/somd/3RML~3RMM/free/output/lam-0.00$ module list
Currently Loaded Modulefiles:
  1) openmm/6.2   2) cuda/6.5
julien@node006:~/projects/Thrombin/dataset00001/somd/3RML~3RMM/free/output/lam-0.00$ ~/sire.app/bin/somd-freenrg -C ../../input/freenrg.cfg -l 0.00 -p OpenCL
(...)
Running MD simulation

Cycle = 1

Using Integrator: "leapfrogverlet"
Integration step = 0.002 ps
Traceback (most recent call last):
  File "/home/julien/sire.app/share/Sire/scripts/somd-freenrg.py", line 140, in <module>
    OpenMMMD.runFreeNrg(params)
  File "/home/julien/sire.app/bundled/lib/python3.3/site-packages/Sire/Tools/__init__.py", line 135, in inner
    retval = func()
  File "/home/julien/sire.app/bundled/lib/python3.3/site-packages/Sire/Tools/OpenMMMD.py", line 1263, in runFreeNrg
    system = moves.move(system, nmoves.val, True)
RuntimeError: There is no registered Platform called "OpenCl"

Platforms CUDA, CPU and Reference work
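A quick way to see which platforms an OpenMM build has actually registered is to ask the Python layer directly. The sketch below assumes the `simtk.openmm` import path used by OpenMM 6.x; the helper name `registered_platforms` is hypothetical and not part of OpenMM or Sire.

```python
def registered_platforms(platform_cls):
    """Return the names of all platforms registered with OpenMM.

    platform_cls is expected to be simtk.openmm.Platform. It is passed in
    as an argument so the helper can be exercised without OpenMM installed.
    """
    return [
        platform_cls.getPlatform(i).getName()
        for i in range(platform_cls.getNumPlatforms())
    ]


if __name__ == "__main__":
    try:
        # Import path used by OpenMM 6.x; an assumption for this cluster.
        from simtk.openmm import Platform
    except ImportError:
        print("OpenMM Python bindings are not importable in this environment")
    else:
        names = registered_platforms(Platform)
        print("Registered platforms:", ", ".join(names))
        if "OpenCL" not in names:
            print("The OpenCL platform is not registered")
```

If OpenCL is absent from the printed list, the build either lacks the OpenCL plugin or failed to load it.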

Probably a build error of OpenMM:

julien@node006:~/projects/Thrombin/dataset00001/somd/3RML~3RMM/free/output/lam-0.00$ ll /home/common/openmm/lib/plugins/
total 3768
drwxrwxr-x 2 manager manager    4096 Jun 17 17:26 ./
drwxrwxr-x 3 manager manager    4096 Jun 17 17:26 ../
-rw-r--r-- 1 manager manager  776739 Jun 17 17:25 libOpenMMAmoebaCUDA.so
-rw-r--r-- 1 manager manager  529314 Jun 17 17:25 libOpenMMAmoebaReference.so
-rw-r--r-- 1 manager manager  471091 Jun 17 17:24 libOpenMMCPU.so
-rw-r--r-- 1 manager manager 1725877 Jun 17 17:25 libOpenMMCUDA.so
-rw-r--r-- 1 manager manager   95999 Jun 17 17:25 libOpenMMDrudeCUDA.so
-rw-r--r-- 1 manager manager   63674 Jun 17 17:25 libOpenMMDrudeReference.so
-rw-r--r-- 1 manager manager  115190 Jun 17 17:25 libOpenMMRPMDCUDA.so
-rw-r--r-- 1 manager manager   53226 Jun 17 17:25 libOpenMMRPMDReference.so
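The plugin listing above can also be checked programmatically. This is a minimal sketch that assumes the core platform plugins follow the `libOpenMM<Name>.so` naming seen in the listing, so that an OpenCL build would ship `libOpenMMOpenCL.so`; the function name `core_plugins_present` is hypothetical. Reference appears to be built into the core library rather than shipped as a plugin, which would be consistent with it working here, so it is not checked.

```python
import os
import re

# Assumed naming convention: core platform plugins are libOpenMM<Name>.so
CORE_PLATFORMS = ("CPU", "CUDA", "OpenCL")
_PLUGIN_RE = re.compile(r"^libOpenMM(\w+)\.so$")


def core_plugins_present(filenames):
    """Map each core platform name to True if its plugin file is listed."""
    found = set()
    for name in filenames:
        match = _PLUGIN_RE.match(name)
        if match:
            found.add(match.group(1))
    return {platform: platform in found for platform in CORE_PLATFORMS}


if __name__ == "__main__":
    plugin_dir = "/home/common/openmm/lib/plugins"  # path taken from the listing
    if os.path.isdir(plugin_dir):
        status = core_plugins_present(os.listdir(plugin_dir))
        for platform, present in status.items():
            print(platform, "plugin found" if present else "plugin MISSING")
    else:
        print("Plugin directory not found:", plugin_dir)
```

Run against the files listed above, this would flag only the OpenCL plugin as missing.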

Where do you keep the sources of the software installed in /home/common?

It could make sense to have a /home/common/sources/ directory.

ppxasjsm commented 9 years ago

I didn't compile OpenMM with OpenCL support, only CUDA.

As far as I know, the OpenCL libraries aren't installed on any of the nodes either. I thought the default platform was CUDA and that we wouldn't need OpenCL.

jmichel80 commented 9 years ago

In the past, at least, the CUDA platform was slower or crashed when used with somd. This may no longer be the case in recent releases of OpenMM, but we should have both platforms available. CUDA normally comes with a decent libopencl, so it generally isn't too hard to enable both platforms.

I will add OpenCL support next week. Did you leave the source and build folders somewhere on the cluster?

ppxasjsm commented 9 years ago

Sorry, I didn't realise that about CUDA/OpenCL. Yes, they are on node005: there is an openmm_build directory and an openmm_6.2 source directory.

jmichel80 commented 9 years ago

I have issues compiling libOpenMMPME

Linking CXX executable ../../../../../TestCudaDrudeLangevinIntegrator
[ 73%] Built target TestCudaDrudeLangevinIntegrator
[ 73%] Building CXX object plugins/drude/platforms/cuda/tests/CMakeFiles/TestCudaDrudeSCFIntegrator.dir/TestCudaDrudeSCFIntegrator.cpp.o
Linking CXX executable ../../../../../TestCudaDrudeSCFIntegrator
[ 73%] Built target TestCudaDrudeSCFIntegrator
[ 73%] Built target TestSerializeDrudeForce
[ 73%] Built target TestSerializeDrudeLangevinIntegrator
[ 73%] Built target OpenMMPME
Linking CXX executable ../../../TestCpuPme
../../../libOpenMMPME.so: undefined reference to `fftwf_free'
../../../libOpenMMPME.so: undefined reference to `fftwf_malloc'
../../../libOpenMMPME.so: undefined reference to `fftwf_plan_with_nthreads'
../../../libOpenMMPME.so: undefined reference to `fftwf_plan_dft_r2c_3d'
../../../libOpenMMPME.so: undefined reference to `fftwf_execute_dft_r2c'
../../../libOpenMMPME.so: undefined reference to `fftwf_init_threads'
../../../libOpenMMPME.so: undefined reference to `fftwf_execute_dft_c2r'
../../../libOpenMMPME.so: undefined reference to `fftwf_plan_dft_c2r_3d'
../../../libOpenMMPME.so: undefined reference to `fftwf_destroy_plan'
collect2: error: ld returned 1 exit status
make[2]: *** [TestCpuPme] Error 1
make[1]: *** [plugins/cpupme/tests/CMakeFiles/TestCpuPme.dir/all] Error 2
make: *** [all] Error 2

Setting OPENMM_BUILD_PME_PLUGIN to OFF avoids the error, but the build then lacks the CPU PME plugin, so functionality is limited.
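To see at a glance what the linker is missing, the undefined symbols can be pulled out of the build output. The sketch below is only an illustration: the helper names are hypothetical, and the rule that a symbol containing "threads" belongs to FFTW's threading API is a heuristic assumption based on the names in the log. All the symbols carry the `fftwf_` prefix, which FFTW uses for its single-precision interface.

```python
import re

# Tolerates a missing space or opening quote, as in mangled copies of the log.
_UNDEFINED_RE = re.compile(r"undefined reference to\s*[`']?(\w+)")


def missing_symbols(linker_output):
    """Return the unique undefined symbols named in linker output, in order."""
    seen = []
    for symbol in _UNDEFINED_RE.findall(linker_output):
        if symbol not in seen:
            seen.append(symbol)
    return seen


def thread_api_symbols(symbols):
    """Pick out symbols whose name contains 'threads' (heuristic)."""
    return [symbol for symbol in symbols if "threads" in symbol]


if __name__ == "__main__":
    sample = (
        "../../../libOpenMMPME.so: undefined reference to `fftwf_free'\n"
        "../../../libOpenMMPME.so: undefined reference to `fftwf_plan_with_nthreads'\n"
        "../../../libOpenMMPME.so: undefined reference to `fftwf_init_threads'\n"
    )
    symbols = missing_symbols(sample)
    print("Missing:", symbols)
    print("Threading API:", thread_api_symbols(symbols))
```

Seeing both plain and threading symbols in the list suggests that neither the single-precision FFTW library nor its threads companion was linked.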

However, I am struggling to make OpenCL work:

manager@node005:~/openmm_build$ ./TestOpenCLSort
exception: Error initializing context: clGetPlatformIDs (-1001)

I am also confused, since OpenMM has been installed in /usr/local/python. So I ran the Python install again, just in case:

sudo make PythonInstall

And then:

manager@node005:~/openmm_build$ python /usr/local/lib/python2.7/dist-packages/simtk/testInstallation.py
There are 3 Platforms available:

1 Reference - Successfully computed forces
2 OpenCL - Error computing forces with OpenCL platform
3 CPU - Successfully computed forces

OpenCL platform error: Error initializing context: clGetPlatformIDs (-1001)

Median difference in forces between platforms:

Reference vs. CPU: 3.43117e-06

So there is still a problem with OpenCL at the moment.
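For context, -1001 is the OpenCL error code CL_PLATFORM_NOT_FOUND_KHR, which the ICD loader returns when it finds no vendor driver registered. On Linux the loader conventionally looks for `.icd` files under /etc/OpenCL/vendors. The sketch below checks that directory; whether this path applies to these nodes is an assumption, and `find_icd_files` is a hypothetical helper name.

```python
import glob
import os


def find_icd_files(vendors_dir="/etc/OpenCL/vendors"):
    """Return the sorted .icd files found in the ICD loader's vendor directory."""
    return sorted(glob.glob(os.path.join(vendors_dir, "*.icd")))


if __name__ == "__main__":
    icd_files = find_icd_files()
    if not icd_files:
        print("No .icd files found: the loader would have no platform to report")
    for path in icd_files:
        with open(path) as handle:
            # Each .icd file normally holds the name or path of a vendor library.
            print(path, "->", handle.read().strip())
```

An empty result here would be consistent with the OpenCL plugin being built but no vendor OpenCL driver being registered on the node.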

ppxasjsm commented 9 years ago

Did you install any OpenCL libraries? They might not be installed by default, since I installed CUDA with a CUDA runscript rather than the packaged Debian drivers, which would have installed the OpenCL libraries automatically.

The Python install is wrong. I may have done this by accident; it should also be installed in /home/common, otherwise we end up with a mess of Python libraries scattered everywhere, probably resulting in import errors that are very hard to untangle.

I recognise the fftw error. I think I saw something like this and somehow managed to solve it. I'll have a look in my notes.

cass00 commented 7 years ago

Did you find out how to solve the fftw error?

cass00 commented 7 years ago

If anyone ends up here: the fftw package has to be built with multi-thread support to solve this problem.
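One way to check whether a suitable FFTW is visible on a node before rebuilding OpenMM is to look up the shared libraries by name. The sketch below assumes FFTW's usual library names, `fftw3f` for single precision and `fftw3f_threads` for its threading support (typically produced by configuring FFTW with `--enable-float --enable-threads`); `locate_fftw` is a hypothetical helper. Note that `ctypes.util.find_library` only searches the system's standard locations, so a copy installed under a custom prefix such as /home/common may not be found.

```python
from ctypes.util import find_library

# Assumed library names, following FFTW's usual naming scheme.
REQUIRED_LIBRARIES = ("fftw3f", "fftw3f_threads")


def locate_fftw(names=REQUIRED_LIBRARIES):
    """Map each library name to what the system lookup finds, or None."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, location in locate_fftw().items():
        print(name, "->", location if location else "NOT FOUND")
```

If `fftw3f_threads` is reported as not found, rebuilding FFTW with thread support, as suggested above, is the likely next step.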