Closed aneesmkp closed 4 years ago
Yes, this is a known issue. CUDA 10.0 works fine on RTX cards if you can compile Tinker-OpenMM with it.
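Before rebuilding, it can help to confirm which CUDA toolkit is actually visible. The following is a minimal sketch, not part of Tinker or OpenMM: the helper names `parse_nvcc_release` and `installed_cuda_release` are hypothetical, and it assumes the first `nvcc` on your PATH is the one your Tinker-OpenMM build uses, which may not be true on a machine with several toolkits installed. For context, error code 218 in the CUDA driver API is `CUDA_ERROR_INVALID_PTX`.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_text):
    """Return (major, minor) parsed from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run the first nvcc found on PATH (an assumption) and parse its release."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc was not found on PATH, or its output could not be parsed")
    elif release == (10, 0):
        print("CUDA 10.0 toolkit found (the version reported to work in this thread)")
    else:
        print("CUDA %d.%d toolkit found; this thread suggests trying 10.0" % release)
```

The parsing is kept separate from the subprocess call so that it can be checked against a pasted version string without a GPU machine at hand.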
From: aneesmkp notifications@github.com
Sent: Monday, March 9, 2020 2:52 PM
To: TinkerTools/Tinker Tinker@noreply.github.com
Cc: Subscribed subscribed@noreply.github.com
Subject: [TinkerTools/Tinker] Tinker-openMM GPU: Segmentation Fault (Core Dumped) (#55)
I am getting a segmentation fault (core dumped) while trying to run dynamic_omm. The input files work fine with the CPU version of dynamic. I followed Lee-Ping Wang's notes while compiling. I am using CUDA 10.2 on an RTX 2070 (Ubuntu 18.04). I also tried running the dynamics with just a water box (https://biomol.bme.utexas.edu/~pren/downloads/waterbox), and I still get the same error.
ERROR OBTAINED FOR THE SIMULATION OF WATER BOX:
anees@basin:~/work/tinker_test/amoebanuc17_solv_noions$ /home/anees/src/Tinker/build_openmm/dynamic_omm.x water36.xyz
##########################################################################
Tinker --- Software Tools for Molecular Design
Version 8.7 June 2019
Copyright (c) Jay William Ponder 1990-2019
All Rights Reserved
##########################################################################
Enter the Number of Dynamics Steps to be Taken : 1000
Enter the Time Step Length in Femtoseconds [1.0] : 2.0
Enter Time between saves in Picoseconds [0.1] : 0.1
Available Statistical Mechanical Ensembles :
(1) Microcanonical (NVE)
(2) Canonical (NVT)
(3) Isoenthalpic-Isobaric (NPH)
(4) Isothermal-Isobaric (NPT)
Enter the Number of the Desired Choice [1] : 2
Enter the Desired Temperature in Degrees K [298] : 300
Return Data from the GPU at Every Time Step [N] : 1000
Number of CUDA Devices Detected : 1
Device Number : 0 Device Name GeForce RTX 2070 Clockspeed (GHz) 1.620 Total Memory (GB) 8.00 Free Memory (GB) 6.77 GPU load 1.00%
Platform CUDA : Setting Precision to MIXED via CUDA-PRECISION
Molecular Dynamics Trajectory via r-RESPA MTS Algorithm
terminate called after throwing an instance of 'OpenMM::OpenMMException'
what(): Error loading CUDA module: CUDA error (218)
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
Aborted (core dumped)
ERROR OBTAINED FOR THE SIMULATION I WAS TRYING:
anees@basin:~/work/tinker_test/amoebanuc17_solv_noions$ /home/anees/src/Tinker/build_openmm/dynamic_omm.x scl_Na_solv_output.xyz
##########################################################################
Tinker --- Software Tools for Molecular Design
Version 8.7 June 2019
Copyright (c) Jay William Ponder 1990-2019
All Rights Reserved
##########################################################################
Enter Potential Parameter File Name : amoebanuc17.prm
Enter the Number of Dynamics Steps to be Taken : 100
Enter the Time Step Length in Femtoseconds [1.0] : 2.0
Enter Time between saves in Picoseconds [0.1] : 0.01
Available Simulation Control Modes :
(1) Constant Total Energy Value (E)
(2) Constant Temperature via Thermostat (T)
Enter the Number of the Desired Choice [1] : 2
Enter the Desired Temperature in Degrees K [298] : 298
Return Data from the GPU at Every Time Step [N] : 10
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Segmentation fault (core dumped)
I have tried the suggestion from issue #52 (https://github.com/TinkerTools/Tinker/issues/52), but it still does not work.
-Anees
Actually, I have this working correctly on a setup very similar to the original poster's: Ubuntu 18.04, CUDA 10.2, an RTX 2070 Max-Q (the mobile version of the 2070), and the current Tinker and Tinker-OpenMM from the TinkerTools GitHub site. Still, as Pengyu notes, various people have had trouble with CUDA 10.2 and RTX cards, so as he suggests, you should try CUDA 10.0. Also, make sure to use the keyword "integrator RESPA" or "integrator VERLET", since not all integrators that work in the CPU code are supported in Tinker-OpenMM. In particular, the default BEEMAN integrator from the CPU code is not supported in Tinker-OpenMM. For AMOEBA, we generally recommend the RESPA integrator with a 2 fs time step.
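The integrator advice above can be checked mechanically before launching a GPU run. This is a rough sketch under stated assumptions, not a Tinker utility: the function names are hypothetical, the set of accepted values contains only the two integrators named in this thread (others may be supported), and it assumes a key file with one keyword per line, case-insensitive keywords, and `#` marking comment lines.

```python
# Only the two integrators named in this thread; Tinker-OpenMM may accept others.
NAMED_IN_THREAD = {"RESPA", "VERLET"}


def find_integrator(key_text):
    """Return the upper-cased value of the last `integrator` keyword, or None."""
    value = None
    for line in key_text.splitlines():
        stripped = line.strip()
        # Assumption: blank lines and lines starting with '#' carry no keywords.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if fields[0].lower() == "integrator" and len(fields) > 1:
            value = fields[1].upper()
    return value


def check_integrator(key_text):
    """Return (ok, message) describing whether the key file follows the advice."""
    integrator = find_integrator(key_text)
    if integrator is None:
        return False, "no integrator keyword; the CPU default (BEEMAN) is not supported"
    if integrator in NAMED_IN_THREAD:
        return True, "integrator %s is one of those named in this thread" % integrator
    return False, "integrator %s was not named as supported in this thread" % integrator


if __name__ == "__main__":
    example_key = "parameters amoebanuc17.prm\nintegrator RESPA\n"
    print(check_integrator(example_key))
```

A key file that passes this check would contain a line such as `integrator RESPA`, with the 2 fs time step entered at the dynamic_omm prompt as in the logs in this thread.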
I'm going to close this issue. If there are still problems under this topic, please post the issue to the Tinker-OpenMM package.
I am getting a segmentation fault (core dumped) while trying to run dynamic_omm. The input files work fine with the CPU version of dynamic. I followed Lee-Ping Wang's notes while compiling. I am using CUDA 10.2 on an RTX 2070 (Ubuntu 18.04). I also tried running the dynamics with just a water box (https://biomol.bme.utexas.edu/~pren/downloads/waterbox), and I still get the same error.
ERROR OBTAINED FOR THE SIMULATION OF WATER BOX:
anees@basin:~/work/tinker_test/amoebanuc17_solv_noions$ /home/anees/src/Tinker/build_openmm/dynamic_omm.x water36.xyz
##########################################################################
Tinker --- Software Tools for Molecular Design
Version 8.7 June 2019
Copyright (c) Jay William Ponder 1990-2019
All Rights Reserved
##########################################################################
Enter the Number of Dynamics Steps to be Taken : 1000
Enter the Time Step Length in Femtoseconds [1.0] : 2.0
Enter Time between saves in Picoseconds [0.1] : 0.1
Available Statistical Mechanical Ensembles :
Enter the Number of the Desired Choice [1] : 2
Enter the Desired Temperature in Degrees K [298] : 300
Return Data from the GPU at Every Time Step [N] : 1000
Number of CUDA Devices Detected : 1
Device Number : 0 Device Name GeForce RTX 2070 Clockspeed (GHz) 1.620 Total Memory (GB) 8.00 Free Memory (GB) 6.77 GPU load 1.00%
Platform CUDA : Setting Precision to MIXED via CUDA-PRECISION
Molecular Dynamics Trajectory via r-RESPA MTS Algorithm
terminate called after throwing an instance of 'OpenMM::OpenMMException'
what(): Error loading CUDA module: CUDA error (218)
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
0 0x7f6e51c5231a
1 0x7f6e51c51503
2 0x7f6e51098f1f
3 0x7f6e51098e97
4 0x7f6e5109a800
5 0x7f6e53b2e956
6 0x7f6e53b34ab5
7 0x7f6e53b34af0
8 0x7f6e53b34d78
9 0x7f6e184992e2
10 0x7f6e1855765b
11 0x7f6e18557f77
12 0x7f6e184e9cfd
13 0x7f6e54165483
14 0x7f6e18502094
15 0x7f6e541a519f
16 0x55d2aefc248b
17 0x55d2aefc135e
18 0x7f6e5107bb96
19 0x55d2aefc13e9
20 0xffffffffffffffff
Aborted (core dumped)
ERROR OBTAINED FOR THE SIMULATION I WAS TRYING:
anees@basin:~/work/tinker_test/amoebanuc17_solv_noions$ /home/anees/src/Tinker/build_openmm/dynamic_omm.x scl_Na_solv_output.xyz
##########################################################################
Tinker --- Software Tools for Molecular Design
Version 8.7 June 2019
Copyright (c) Jay William Ponder 1990-2019
All Rights Reserved
##########################################################################
Enter Potential Parameter File Name : amoebanuc17.prm
Enter the Number of Dynamics Steps to be Taken : 100
Enter the Time Step Length in Femtoseconds [1.0] : 2.0
Enter Time between saves in Picoseconds [0.1] : 0.01
Available Simulation Control Modes :
Enter the Number of the Desired Choice [1] : 2
Enter the Desired Temperature in Degrees K [298] : 298
Return Data from the GPU at Every Time Step [N] : 10
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
0 0x7f322a5b531a
1 0x7f322a5b4503
2 0x7f32299fbf1f
3 0x7f322cae9082
4 0x7f322cc9190b
5 0x56072cc3e46b
6 0x56072cc45a5d
7 0x56072cc3bf8d
8 0x56072cc3b35e
9 0x7f32299deb96
10 0x56072cc3b3e9
11 0xffffffffffffffff
Segmentation fault (core dumped)
I have tried the suggestion from https://github.com/TinkerTools/Tinker/issues/52, but it still does not work.
-Anees