RowleyGroup / NNP-MM

Scripts to interface TorchANI NNP with NAMD

Getting execution error #1

Open aaayushg opened 4 years ago

aaayushg commented 4 years ago

I'm trying to run the 'erlotinib-water' example calculation but get the following error:

sh: 1: /home/aayush/Documents/NNP-MM/client.py: Permission denied
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: Error running command for QM forces calculation.

Charm++ fatal error: FATAL ERROR: Error running command for QM forces calculation.

I have corrected all the paths and have 'rw' permission to each file/folder.

cnrowley commented 4 years ago

Please make the script executable:

chmod u+x /home/aayush/Documents/NNP-MM/client.py
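
The `sh: 1: ...: Permission denied` line shows that the client script is launched through the shell, so it needs the execute bit; read/write permission alone is not enough. As a minimal sketch of what `chmod u+x` does, the Python below adds the owner-execute bit to a file. The helper name `make_user_executable` is hypothetical and not part of NNP-MM.

```python
import os
import stat


def make_user_executable(path):
    """Add the owner-execute bit to `path`, like `chmod u+x path`.

    Hypothetical helper for illustration; not part of NNP-MM.
    """
    mode = os.stat(path).st_mode
    # Keep the existing permission bits and OR in the owner-execute bit.
    os.chmod(path, mode | stat.S_IXUSR)


# Example call (path taken from the thread):
# make_user_executable("/home/aayush/Documents/NNP-MM/client.py")
```

Running `ls -l` on the script afterwards should show an `x` in the owner column.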

aaayushg commented 4 years ago

That solves the error, thanks! But I'm now getting another error, and I'm not sure what's missing.

rm: cannot remove '/dev/shm/0/qmmm_0.input.result': No such file or directory
Traceback (most recent call last):
  File "/home/aayush/Documents/NNP-MM/client.py", line 12, in <module>
    client.connect(sock_address)
ConnectionRefusedError: [Errno 111] Connection refused
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: Error running command for QM forces calculation.

Charm++ fatal error: FATAL ERROR: Error running command for QM forces calculation.

cnrowley commented 4 years ago

Did you do these commands?

export RUNDIR=/dev/shm/$SLURM_JOB_ID/
mkdir $RUNDIR

This is to set the temporary directory to some place that exists and is writable. It could be easier to just change this line in your NAMD input file:

qmBaseDir /tmp
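
To decide which directory to give to qmBaseDir, it can help to check which candidates actually exist and are writable on the machine. The sketch below does that check in Python. The function name `first_writable_dir` and the candidate list are assumptions for illustration, not part of NNP-MM or NAMD.

```python
import os


def first_writable_dir(candidates):
    """Return the first candidate that is an existing, writable directory.

    Hypothetical helper for illustration. Returns None if none qualifies.
    """
    for path in candidates:
        # W_OK: files can be created; X_OK: the directory can be entered.
        if os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK):
            return path
    return None


# Example: prefer the RAM disk, fall back to /tmp.
# print(first_writable_dir(["/dev/shm", "/tmp"]))
```

Whatever path this returns is a candidate for the qmBaseDir line in the NAMD input file.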

aaayushg commented 4 years ago

Yes, I am using all the files from GitHub and trying to run a sample calculation (with the files in the example folder). My run_nnp.sh file looks like this (script from the README):

export RUNDIR=/dev/shm/$SLURM_JOB_ID/
mkdir $RUNDIR

module load nixpkgs/16.09 intel/2016.4

module load namd-multicore/2.13

module load python/3.7.0

python3 /home/aayush/Documents/NNP-MM/server.py >& /home/aayush/Documents/NNP-MM/server.$SLURM_JOBID.log&
sleep 10 # wait to make sure server has initialized first
namd2 +p7 md.conf > namd_out.log

Also, everything is the same in md.conf (from the example folder):

qmforces on
qmParamPDB qmmm.pdb
qmSoftware custom
# XXX change this to absolute path of socket client script
qmexecpath /home/aayush/Documents/NNP-MM/client.py
# this should be writeable absolute path on a fast file system (e.g., a RAM disk)
qmBaseDir /dev/shm/
QMColumn occ
qmChargeMode none
qmElecEmbed off

cnrowley commented 4 years ago

The commands to set the temporary directory aren't working. Possibly, this is because your queuing system is not setting the SLURM_JOBID environment variable (are you running this through a SLURM queuing system?). Alternatively, this could be because the /dev/shm directory is not available or writable by you on this system. It is impossible for me to know this. I would recommend talking to your system administrator for assistance. It is likely that /tmp will exist on all systems, so the simplest solution is to follow my previous suggestion of changing this line in the NAMD input file to direct to /tmp instead:

qmBaseDir /tmp

aaayushg commented 4 years ago

Thanks for your prompt replies. I'm not running it through a SLURM queue; I'm running it on a local machine. I have now changed qmBaseDir to /tmp but am still getting the error:

Traceback (most recent call last):
  File "/home/aayush/Documents/NNP-MM/client.py", line 13, in <module>
    client.connect(sock_address)
ConnectionRefusedError: [Errno 111] Connection refused
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: Error running command for QM forces calculation.

Charm++ fatal error: FATAL ERROR: Error running command for QM forces calculation.

cnrowley commented 4 years ago

Did you start the server daemon before running this command? See the line regarding server.py. Start it and put it in the background. Check that it is running with ps; then you can run NAMD.

You will need to change all the lines involving SLURM because these will only be sensible on a SLURM queuing system.
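
`ConnectionRefusedError` means nothing was accepting connections at the address the client tried. Besides checking with ps, one way to confirm the server is ready is to poll its socket until a connection succeeds. The sketch below assumes the server listens on a Unix domain socket; that is an assumption, as is the socket path in the example, so check `sock_address` in client.py for the real address and socket type. `wait_for_server` is a hypothetical helper, not part of NNP-MM.

```python
import socket
import time


def wait_for_server(sock_path, timeout=10.0, interval=0.5):
    """Poll a Unix domain socket until it accepts a connection.

    Hypothetical helper; assumes an AF_UNIX stream socket at `sock_path`.
    Returns True once a connection succeeds, False after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            client.connect(sock_path)
            return True
        except (ConnectionRefusedError, FileNotFoundError):
            # Server not listening yet (or socket file not created yet).
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
        finally:
            client.close()


# Example with a placeholder path (not the real NNP-MM socket address):
# ready = wait_for_server("/tmp/example.sock", timeout=30.0)
```

Note that a probe connection like this is only safe if the server tolerates a client that connects and disconnects without sending data; if it does not, checking with ps is the safer test.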

aaayushg commented 4 years ago

Yes, I start server.py before running NAMD and it runs without error. I get an error when I execute namd2:

ERROR: Could not find QM output file!
FATAL ERROR: (unknown error): No such file or directory

Which files/lines have a SLURM dependency?

cnrowley commented 4 years ago

The lines of the script that contain $SLURM_JOB_ID need to be changed. These variables are defined by SLURM queuing systems when a job is executed. You need to change them to appropriate values on your local system.

export RUNDIR=/dev/shm/$SLURM_JOB_ID/
mkdir $RUNDIR

module load nixpkgs/16.09 intel/2016.4

module load namd-multicore/2.13

module load python/3.7.0

python3 /home/aayush/Documents/NNP-MM/server.py >& /home/aayush/Documents/NNP-MM/server.$SLURM_JOBID.log&
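
On a local machine the SLURM variables are unset, so `/dev/shm/$SLURM_JOB_ID/` expands to just `/dev/shm/` and the log name loses its job number. One possible substitute is any value unique to the run, such as the process ID. The sketch below shows that idea in Python. The helper `make_run_dir`, the `/tmp` default, and the PID fallback are assumptions for illustration, not something the NNP-MM scripts provide.

```python
import os


def make_run_dir(base="/tmp", env=os.environ):
    """Create and return a per-run scratch directory under `base`.

    Hypothetical helper. Uses SLURM_JOB_ID when SLURM sets it, and
    falls back to the current process ID on a local machine.
    """
    job_id = env.get("SLURM_JOB_ID") or str(os.getpid())
    run_dir = os.path.join(base, job_id)
    # exist_ok avoids an error if the directory is left over from a rerun.
    os.makedirs(run_dir, exist_ok=True)
    return run_dir


# Example:
# print(make_run_dir())
```

The same substitution applies to the `server.$SLURM_JOBID.log` file name in the server line.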

As for your new error message, I can't diagnose this problem remotely. I believe your system does not have the path, environment variables, or Python libraries correctly configured. You should talk to your system administrator for assistance.

aaayushg commented 4 years ago

Could you suggest what path/environment I need to set up for this particular execution? Thanks.