Closed Joey-zhangcy closed 2 years ago
The model_devi file should be generated from the second step.
Dear njzjz: Thank you very much for your kind reply. It was my oversight not to describe the problem carefully. I ran dpgen run rather than dp train input.json, so the model_devi file should have been generated. Here is my param.json file. Could you please give me some advice? I would much appreciate it.
Thanks a lot.
All the best, Joey param.zip
"but there is no such model_devi file." This is because the definition of "sys_configs" is not correct. You should change the JSON file like this: "sys_configs_prefix": "/public/home/zhangchengyi/lammps-practice/tutorials/tutorials-master/EXAMPLES/dpgen_cloudserver/CH4.POSCAR.01x01x01/01.scale_pert/sys-0004-0001/scale-1.000/", "sys_configs": [ [ "00000/POSCAR" ], [ "00001/POSCAR" ] ],
Dear taipinghu: Thank you very much for your advice, but that change does not seem to work. I still appreciate your help.
Thanks a lot.
All the best, Joey
I think uploading all of your input files would be the most efficient way to track down the error.
Dear taipinghu: Thank you very much for your help. Here are all my input files; all the POSCARs were taken from the examples in dpgen. I would appreciate it if you could take the time to check them.
Thanks a lot.
All the best, Joey all.zip
What is the error in dpgen run?
Dear taipinghu: Here are my output files; it seems that I can't upload the folder itself. I set the number of training steps to 2000 in param.json. Judging by lcurve.out, the first training step seems to finish, and then dpgen stops. Thanks a lot.
All the best, Joey output.zip
Is the error caused by dpdispatcher? You can change to the work directory and check whether it works normally.
Dear taipinghu: The dpdispatcher files seem to be generated by dpgen. I followed your advice and moved the files to another folder, but dpgen still doesn't work. Thanks a lot.
All the best, Joey
Dear taipinghu: I found something interesting. I retyped dpgen run param.json machine.json in the terminal. The code reran and the model_devi appeared, but an error was reported. Could you give me some advice? err.txt Thanks a lot.
All the best, Joey
I think you should first check whether the paths (sys_configs_prefix and sys_configs in the parameter JSON file) are correct. You can go to the 01.model_devi directory to see whether directories such as task.000.00000 exist.
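The check described above can also be scripted. The following is only a sketch: the helper name inspect_model_devi is made up for illustration, and it assumes the layout mentioned in this thread (an iteration directory such as iter.000000 containing 01.model_devi, with task.* directories and a confs folder inside).

```python
import glob
import os


def inspect_model_devi(iter_dir):
    """Return (task directories, entries in confs) found under 01.model_devi.

    Hypothetical helper; the directory names follow the ones quoted in this thread.
    """
    work = os.path.join(iter_dir, "01.model_devi")
    tasks = sorted(
        p for p in glob.glob(os.path.join(work, "task.*")) if os.path.isdir(p)
    )
    confs = sorted(glob.glob(os.path.join(work, "confs", "*")))
    return tasks, confs
```

Calling inspect_model_devi("iter.000000") and getting two empty lists back would match the symptom of a confs folder that was never filled.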
Dear taipinghu: I went through the paths (sys_configs in the parameter JSON file) with the cd command. Nothing was wrong. In the 01.model_devi folder there are just four .pb files, a cur_job.json, and an empty folder named confs. There is nothing named task.000.0000. Thanks a lot.
All the best, Joey
Please check carefully again; I still think the path in sys_configs is incorrect. You can write a simple script that reads the param.json file and prints sys_configs.
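A minimal sketch of such a script is given here. The function names are made up for illustration, and treating sys_configs_prefix as optional (defaulting to an empty string) is an assumption of this sketch, not a statement about what dpgen does internally.

```python
import json
import sys


def load_sys_configs(param_path):
    """Return (sys_configs_prefix, sys_configs) exactly as stored in the JSON file."""
    with open(param_path) as f:
        jdata = json.load(f)
    # Assumption of this sketch: a missing prefix is treated as an empty string.
    prefix = jdata.get("sys_configs_prefix", "")
    return prefix, jdata["sys_configs"]


def describe_sys_configs(prefix, sys_configs):
    """Build one printable line per pattern, tagged with its system index."""
    lines = ["sys_configs_prefix = %r" % prefix]
    for idx, patterns in enumerate(sys_configs):
        for pattern in patterns:
            lines.append("sys %d: %s" % (idx, pattern))
    return lines


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python print_sys_configs.py param.json
    prefix, sys_configs = load_sys_configs(sys.argv[1])
    print("\n".join(describe_sys_configs(prefix, sys_configs)))
```

Printing the values as the JSON parser sees them makes stray spaces, missing slashes, or a wrong nesting level of the lists easier to spot than reading the file by eye.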
Dear taipinghu: Thanks for your advice. However, a data.init folder is generated automatically in iter.000000, and all the sys_configs files are listed inside it. If the sys_configs path were incorrect and dpgen could not find these files, how could they be listed there?
Thanks a lot.
All the best, Joey
(1) The data.init folder originates from init_data_prefix and init_data_sys in param.json, not from sys_configs_prefix and sys_configs. (2) As mentioned above, you found an empty confs folder in 01.model_devi. This confs directory stores the LAMMPS lmp-format files, which are converted from the POSCARs given by os.path.join(sys_configs_prefix, sys_configs).
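To see which POSCAR files the joined paths actually point to, something like the following sketch can help. It assumes that every entry of sys_configs is joined with the prefix via os.path.join and then expanded as a glob pattern; whether dpgen expands patterns in exactly this way is an assumption here, and resolve_sys_configs and report are hypothetical names.

```python
import glob
import os


def resolve_sys_configs(prefix, sys_configs):
    """Join every pattern with the prefix and expand it with glob.

    Returns one list per system, holding (full_pattern, matched_files) pairs.
    """
    resolved = []
    for patterns in sys_configs:
        per_system = []
        for pattern in patterns:
            full = os.path.join(prefix, pattern)
            per_system.append((full, sorted(glob.glob(full))))
        resolved.append(per_system)
    return resolved


def report(resolved):
    """Print how many files each pattern matched, flagging empty matches."""
    for idx, per_system in enumerate(resolved):
        for full, matches in per_system:
            status = "%d file(s)" % len(matches) if matches else "NO MATCH"
            print("sys %d: %s -> %s" % (idx, full, status))
```

Any pattern reported as NO MATCH would leave nothing to convert for that system, which is consistent with an empty confs folder.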
Dear taipinghu: I really appreciate your help. No matter how I changed the path style for the original files, it didn't work. When I downloaded the sys_configs input files from the Internet and changed the path, it worked. By the way, may I ask one more question? Every time I run nohup dpgen run param.json machine.json, it stops after one step, and I need to retype the command in the terminal before the next step runs. Do you have any idea what causes this? Thanks a lot.
All the best, Joey
As for your first question, you should know that the dpgen workflow contains three stages: 00.train, 01.model_devi, and 02.fp. Each stage in turn contains three steps, e.g., make_train, run_train, and post_train. You can read the record.dpgen file to find the current step, which will help you locate the error.
dpgen runs the steps above automatically, unless you have written an incorrect machine.json file (its contents depend on your scheduling system).
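As an illustration of reading record.dpgen, here is a small sketch. It rests on an assumption not confirmed in this thread: that each line of record.dpgen holds two integers, the iteration index and a step index from 0 to 8, ordered as the three steps of 00.train, then 01.model_devi, then 02.fp. Only the three train step names are given above; the model_devi and fp names below are filled in by analogy, and last_finished_step is a made-up helper.

```python
# Assumed step order; indices 3 to 8 are named by analogy with the train steps.
STEP_NAMES = [
    "make_train", "run_train", "post_train",
    "make_model_devi", "run_model_devi", "post_model_devi",
    "make_fp", "run_fp", "post_fp",
]


def last_finished_step(record_text):
    """Return (iteration, step_name) for the last non-empty line, or None."""
    rows = [line.split() for line in record_text.splitlines() if line.strip()]
    if not rows:
        return None
    iter_idx, step_idx = int(rows[-1][0]), int(rows[-1][1])
    return iter_idx, STEP_NAMES[step_idx]
```

Under that assumption, passing the contents of record.dpgen (for example open("record.dpgen").read()) shows the last step that was recorded, so the step right after it is the one to investigate.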
Dear taipinghu: Thank you very much for your help. I will adjust the parameters carefully. Thanks a lot.
All the best, Joey
It seems that this problem has been solved, so I'll close this issue. If you still have questions, you can reopen this issue or create a new one.
Dear DP users, I'm new to deep modeling. When I followed the CH4 case on the website, I ran into some problems. When the keyword load_ckpt is in my param.json, dp train stops and reports "undefined key **load_ckpt is not allowed in strict mode**". When I delete load_ckpt and restart the training, the first dp training step finishes, but there is no such model_devi file. I guess LAMMPS can't load the potential function from the first dp train step, but I can't figure it out. Could someone help? Any suggestions or comments will be much appreciated.
Thanks a lot.
All the best, Joey train (2).log