Open DM0815 opened 1 year ago
Please provide the "above exception" mentioned in your error message. Thanks.
I'm sorry for replying late.
Hi, I have run into the same problem. Have you solved the error?
At this time, we don't have access to any LSF node. If you have found what is wrong with dpdispatcher, feel free to contribute to the code.
I am using the LSF queue system to run dpgen from the login node of a server cluster. After I submit the command, it reports "RuntimeError: Meet errors will handle unexpected submission state." and suggests that I check the remote_root. But there is no error information in the work dir. In the dp task dir the jobs are still running and train.log looks fine, and I can see the jobs in the queue system. I don't know what is wrong. Can you give me some hints? The machine.json and the error information are attached.
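One way to narrow this down is to compare what LSF itself reports for the job with what the error claims. The following is a minimal sketch, not dpdispatcher's actual status handling: it reads the STAT column from the default `bjobs <jobid>` output and sorts it into rough categories. The function names, the category names, and the sample output (host and job names are made up) are assumptions for illustration only.

```python
from typing import Optional

# LSF job states as printed in the STAT column of `bjobs`.
# The grouping into categories is an assumption made for this sketch;
# it is not taken from dpdispatcher.
STATE_CATEGORIES = {
    "PEND": "waiting",
    "WAIT": "waiting",
    "RUN": "running",
    "PSUSP": "suspended",
    "USUSP": "suspended",
    "SSUSP": "suspended",
    "DONE": "finished",
    "EXIT": "failed",
    "UNKWN": "unclear",
    "ZOMBI": "unclear",
}

# Hypothetical example of default `bjobs <jobid>` output (made-up values).
SAMPLE_BJOBS_OUTPUT = """\
JOBID   USER    STAT  QUEUE   FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
1234    dmeng   RUN   normal  login01     node01      train      Mar 16 10:00
"""


def parse_bjobs_stat(output: str) -> Optional[str]:
    """Return the STAT field of the first job row, or None if no job row is found."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    for i, line in enumerate(lines):
        columns = line.split()
        # The header row starts with JOBID; the job row follows it.
        if columns[0] == "JOBID" and "STAT" in columns and i + 1 < len(lines):
            fields = lines[i + 1].split()
            idx = columns.index("STAT")
            if idx < len(fields):
                return fields[idx]
    return None


def classify_lsf_state(stat: str) -> str:
    """Map a raw STAT value to one of the rough categories above."""
    return STATE_CATEGORIES.get(stat.strip().upper(), "unrecognized")


if __name__ == "__main__":
    stat = parse_bjobs_stat(SAMPLE_BJOBS_OUTPUT)
    print(stat, classify_lsf_state(stat) if stat else "no job row found")
```

If the state comes back as running while dpgen still raises the error, the job itself is probably healthy and the problem more likely lies in how the submission state is being detected.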
machine.json:
{
  "api_version": "1.0",
  "_deepmd_version": "2.1.0",
  "train": {
    "command": "dp",
    "machine": {
      "batch_type": "LSF",
      "context_type": "local",
      "local_root": "./",
      "remote_root": "/public/home/dmeng/DPGEN/0316testlsf/tmp"
    },
    "resources": {
      "number_node": 1,
      "cpu_per_node": 8,
      "gpu_per_node": 0,
      "queue_name": "normal",
      "group_size": 2,
      "_batch_type": "LSF",
      "_kwargs": {},
      "source_list": ["/public/home/dmeng/anaconda3/bin/activate deepmd"]
    }
  },
  "model_devi": {
    "command": "lmp -i input.lammps -v restart 0",
    "machine": {
      "batch_type": "LSF",
      "context_type": "local",
      "local_root": "./",
      "remote_root": "/public/home/dmeng/DPGEN/0316testlsf/tmp"
  "fp": {
    "command": "ulimit -s unlimited && mpirun -n 8 /public/home/dmeng/softwares/vasp.5.4/bin/vasp_std",
    "machine": {
      "batch_type": "LSF",
      "context_type": "local",
      "local_root": "./",
      "remote_root": "/public/home/dmeng/DPGEN/0316testlsf/tmp"
    },
    "resources": {
      "number_node": 1,
      "cpu_per_node": 8,
      "gpu_per_node": 0,
      "queue_name": "normal",
      "group_size": 50,
      "_batch_type": "LSF",
      "_kwargs": {},
      "source_list": ["/public/softwares/intel/oneapi/setvars.sh"]
    }
  }
}
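Since the machine.json as pasted appears to be cut off inside the model_devi section, it may be worth confirming that the file on disk parses and has the same shape in all three stages. Below is a rough sketch using only Python's standard `json` module. The stage names and the three keys checked per stage are taken from the file pasted above, not from the official dpgen schema, and `check_machine_config` is a hypothetical helper name.

```python
import json
from typing import List

# Stage and key names as they appear in the pasted machine.json.
# This is an assumption for the sketch, not dpgen's official schema.
STAGES = ("train", "model_devi", "fp")
REQUIRED_KEYS = ("command", "machine", "resources")


def check_machine_config(text: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg} (line {err.lineno}, column {err.colno})"]

    if not isinstance(config, dict):
        return ["top level is not a JSON object"]

    problems = []
    for stage in STAGES:
        section = config.get(stage)
        if not isinstance(section, dict):
            problems.append(f"missing section: {stage}")
            continue
        for key in REQUIRED_KEYS:
            if key not in section:
                problems.append(f"{stage}: missing key '{key}'")
    return problems


if __name__ == "__main__":
    # Reads machine.json from the current directory.
    with open("machine.json", encoding="utf-8") as handle:
        for problem in check_machine_config(handle.read()):
            print(problem)
```

This only checks the overall structure. It says nothing about whether the values (queue name, paths, group size) are right for a given cluster.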