deepmodeling / deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics
https://docs.deepmodeling.com/projects/deepmd/
GNU Lesser General Public License v3.0

Error while dp train input.json #4126

Open umang4002 opened 4 days ago

umang4002 commented 4 days ago

Summary

I constructed the data folders with type.raw, type_map.raw, and set.000 containing box.npy, force.npy, energy.npy, and coord.npy. This is the input.json I am using:

{
  "model": {
    "type_map": ["W", "Fe", "Ni", "Co"],
    "descriptor": {
      "type": "se_e2_a",
      "rcut": 6.0,
      "rcut_smth": 0.5,
      "sel": [40, 40, 40, 40],
      "neuron": [10, 20, 40],
      "resnet_dt": false,
      "axis_neuron": 4,
      "seed": 1,
      "_comment": "that's all"
    },
    "fitting_net": {
      "neuron": [100, 100, 100],
      "resnet_dt": true,
      "seed": 1,
      "_comment": "that's all"
    },
    "_comment": "that's all"
  },
  "learning_rate": {
    "type": "exp",
    "decay_steps": 5000,
    "start_lr": 0.001,
    "stop_lr": 3.51e-08,
    "_comment": "that's all"
  },
  "loss": {
    "type": "ener",
    "start_pref_e": 0.02,
    "limit_pref_e": 1,
    "start_pref_f": 1000,
    "limit_pref_f": 1,
    "start_pref_v": 0,
    "limit_pref_v": 0,
    "_comment": "that's all"
  },
  "training": {
    "training_data": {
      "systems": ["train_data/set_1/", "train_data/set_2/", "train_data/set_3/"],
      "batch_size": "auto",
      "_comment": "that's all"
    },
    "validation_data": {
      "systems": ["test_data/set_1/", "test_data/set_2/", "test_data/set_3/"],
      "batch_size": "auto",
      "numb_btch": 1,
      "_comment": "that's all"
    },
    "numb_steps": 100000,
    "seed": 10,
    "disp_file": "lcurve.out",
    "disp_freq": 1000,
    "save_freq": 10000
  }
}

DeePMD-kit Version

DeePMD-kit v2.2.9

Backend and its version

Tensorflow 2.9.0

Python Version, CUDA Version, GCC Version, LAMMPS Version, etc

No response

Details

WARNING:tensorflow:From /home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
DEEPMD INFO Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/deepmd/bin/dp", line 10, in <module>
    sys.exit(main())
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/main.py", line 656, in main
    deepmd_main(args)
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/main.py", line 74, in main
    train_dp(dict_args)
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 149, in train
    jdata = update_sel(jdata)
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 512, in update_sel
    jdata_cpy["model"] = Model.update_sel(jdata, jdata["model"])
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/model/model.py", line 566, in update_sel
    return cls.update_sel(global_jdata, local_jdata)
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/model/model.py", line 723, in update_sel
    local_jdata_cpy["descriptor"] = Descriptor.update_sel(
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/descriptor/descriptor.py", line 511, in update_sel
    return cls.update_sel(global_jdata, local_jdata)
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se.py", line 162, in update_sel
    return update_one_sel(global_jdata, local_jdata_cpy, False)
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 479, in update_one_sel
    tmp_sel = get_sel(
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 440, in get_sel
    _, max_nbor_size = get_nbor_stat(jdata, rcut, one_type=one_type)
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 390, in get_nbor_stat
    train_data = get_data(
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 323, in get_data
    data = DeepmdDataSystem(
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/utils/data_system.py", line 100, in __init__
    DeepmdData(
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/utils/data.py", line 76, in __init__
    self.atom_type = self._load_type(root)
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/utils/data.py", line 583, in _load_type
    atom_type = (sys_path / "type.raw").load_txt(ndmin=1).astype(np.int32)
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/utils/path.py", line 160, in load_txt
    return np.loadtxt(str(self.path), **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/numpy/lib/npyio.py", line 1373, in loadtxt
    arr = _read(fname, dtype=dtype, comment=comment, delimiter=delimiter,
  File "/home/user/.local/lib/python3.10/site-packages/numpy/lib/npyio.py", line 1016, in _read
    arr = _load_from_filelike(
  File "/home/user/anaconda3/envs/deepmd/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte

njzjz commented 3 days ago

Did you generate the file using an encoding other than UTF-8?
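One way to check this: the traceback shows the failure while reading type.raw as text, and the offending byte 0x93 at position 0 is also the first byte of the NumPy .npy binary header (`\x93NUMPY`). A possible explanation, which is an assumption and not confirmed in this thread, is that type.raw was written with `np.save` instead of as plain text. The sketch below uses two hypothetical helpers, `looks_like_npy` and `write_type_raw`, to detect that case and to rewrite the file as text.

```python
import tempfile
from pathlib import Path

import numpy as np

NPY_MAGIC = b"\x93NUMPY"  # first six bytes of every .npy file


def looks_like_npy(path):
    """Return True if the file starts with the NumPy binary header."""
    with open(path, "rb") as fh:
        return fh.read(len(NPY_MAGIC)) == NPY_MAGIC


def write_type_raw(path, atom_types):
    """Write integer atom types as plain text, one index per line."""
    np.savetxt(path, np.asarray(atom_types, dtype=int), fmt="%d")


# Small demonstration in a temporary directory (illustrative data only).
with tempfile.TemporaryDirectory() as tmp:
    type_raw = Path(tmp) / "type.raw"

    # Simulate the suspected mistake: binary .npy content in type.raw.
    with open(type_raw, "wb") as fh:
        np.save(fh, np.array([0, 1, 2, 3]))
    was_binary = looks_like_npy(type_raw)

    # Rewrite as plain text and read it back the way the traceback does.
    write_type_raw(type_raw, [0, 1, 2, 3])
    is_binary_after = looks_like_npy(type_raw)
    reloaded = np.loadtxt(str(type_raw), ndmin=1).astype(np.int32)
```

If `looks_like_npy` returns True for your own type.raw, rewriting it as text with the 0-based indices into type_map would be the first thing to try.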

umang4002 commented 2 days ago

Could the number format in the coord.npy file (73.57 versus 7.357e+01) be causing this error? I am using dump files from LAMMPS to create coord.npy, force.npy, box.npy, and energy.npy for the corresponding frames.

umang4002 commented 2 days ago

When I use dpdata, the generated coord file shows values in the 7.357e+01 format, but when I create the same files manually, the values are in the former format (73.57). Can this be the source of the error?

umang4002 commented 2 days ago

Also, my dump file has the columns id type x y z fx fy fz, but the command

import dpdata
dsys = dpdata.System("/content/dump.w_ni_fe_1900.lammpstrj", fmt="lammps/dump")
dsys.to("deepmd/npy", "deepmd_data", set_size=dsys.get_nframes())

creates only coord.npy, type.raw, type_map.raw, and box.npy.

Even though the dump file contains fx fy fz, the command does not read the forces, so no force.npy file is created.

To work around this, I have to extract the corresponding data manually, and training on those files results in the error above. The shape of the data is the same whether I use dpdata or extract the data from the frames manually.

I think the forces are missing because https://github.com/deepmodeling/dpdata/tree/master/dpdata/lammps/dump.py does not contain any code that reads forces for the lammps/dump format.
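For the manual extraction described above, a minimal sketch of reading coordinates and forces from LAMMPS dump text could look like the following. It assumes each frame has an `ITEM: ATOMS` header naming the columns id, x, y, z, fx, fy, fz, and it sorts atoms by id so that every frame uses the same atom order. The function name `parse_dump` and the two-atom sample are hypothetical and are not part of dpdata.

```python
import numpy as np

# Tiny illustrative dump with one frame; the atoms are deliberately out of order.
SAMPLE_DUMP = """\
ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp pp
0.0 10.0
0.0 10.0
0.0 10.0
ITEM: ATOMS id type x y z fx fy fz
2 2 1.0 1.0 1.0 -0.1 -0.2 -0.3
1 1 0.0 0.0 0.0 0.1 0.2 0.3
"""


def parse_dump(text):
    """Return (coord, force), each with shape (nframes, natoms * 3)."""
    lines = text.strip().splitlines()
    coords, forces = [], []
    natoms = 0
    i = 0
    while i < len(lines):
        if lines[i].startswith("ITEM: NUMBER OF ATOMS"):
            natoms = int(lines[i + 1])
            i += 2
        elif lines[i].startswith("ITEM: ATOMS"):
            cols = lines[i].split()[2:]  # column names after "ITEM: ATOMS"
            block = lines[i + 1 : i + 1 + natoms]
            rows = np.array([ln.split() for ln in block], dtype=float)
            rows = rows[np.argsort(rows[:, cols.index("id")])]  # sort by atom id
            xyz = rows[:, [cols.index(c) for c in ("x", "y", "z")]]
            fxyz = rows[:, [cols.index(c) for c in ("fx", "fy", "fz")]]
            coords.append(xyz.reshape(-1))  # flatten to x1 y1 z1 x2 y2 z2 ...
            forces.append(fxyz.reshape(-1))
            i += 1 + natoms
        else:
            i += 1
    return np.array(coords), np.array(forces)


coord, force = parse_dump(SAMPLE_DUMP)
```

The resulting arrays can then be written with `np.save` to coord.npy and force.npy inside set.000, while type.raw and type_map.raw stay plain text.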

umang4002 commented 2 days ago

This is the data format I am using for training

energy = [-245065.86684099, -245043.93657998, -245078.53004939, -245036.34437463, -245108.69610125, -245173.16775741, -245215.20799936, -245135.88491991, -245208.40294702,........,]

force = array([[ 0.722155 , 1.66346 , 0.182122 , ..., -0.116272 , 0.485847 , -1.33518 ], [-0.83064 , -1.31154 , 0.387978 , ..., 6.23634 , 6.10007 , -1.11442 ], [-0.142134 , 1.41269 , -2.4242 , ..., -4.30655 , 0.709387 , -1.23845 ], ..., [ 0.0186577, 0.150581 , -0.727243 , ..., -0.502336 , -0.660103 , -1.29336 ], [ 1.28235 , -0.375996 , 0.200156 , ..., 1.17807 , 1.61571 , 0.646577 ], [ 1.18955 , 1.91557 , 0.920073 , ..., -1.23146 , 1.72745 , 3.02438 ]])

coord = ([[69.1717 , 59.716 , 4.91938, ..., 57.8245 , 48.3722 , 14.392 ], [69.5761 , 60.1495 , 74.4595 , ..., 57.8314 , 48.1864 , 14.312 ], [69.5075 , 60.0246 , 5.08948, ..., 57.8827 , 48.7315 , 14.1056 ], ..., [68.4911 , 59.1742 , 73.6108 , ..., 56.6714 , 46.084 , 12.5388 ], [68.1387 , 59.1014 , 73.3706 , ..., 56.4182 , 46.2359 , 12.5394 ], [68.2556 , 58.8982 , 73.3789 , ..., 56.6145 , 45.388 , 12.4219 ]])

box = array([[69.60792272, 0. , 0. , 0. , 69.60792272, 0. , 0. , 0. , 69.60792272], ...])

umang4002 commented 2 days ago

However, I ran another training with different data, and it ran without this error.

This is the data that runs properly:

energy = [-245065.86684099, -245043.93657998, -245078.53004939, -245036.34437463, -245108.69610125, -245173.16775741, -245215.20799936, -245135.88491991, -245208.40294702,........,]

force = array([[ 3.6391 , 2.20297 , 3.04376 , ..., 0.165536 , 0.121503 , -0.939756 ], [-0.955997 , -1.82867 , -0.320071 , ..., 5.49932 , -0.213522 , 2.20919 ], [ 1.86836 , 2.75882 , -0.457042 , ..., 4.37476 , -1.67104 , -0.68138 ], ..., [-0.111075 , -0.0626644, -0.277481 , ..., -0.744622 , 0.755648 , -0.780645 ], [-0.399858 , -1.18326 , -1.75817 , ..., -1.61044 , -0.189367 , 1.03649 ], [-0.48448 , -3.57668 , -0.724496 , ..., 4.93316 , 1.73451 , -0.469619 ]])

coord = array([[6.70427000e-02, 2.46912000e-01, 2.05648000e-02, ..., 6.99589000e+01, 6.99504000e+01, 6.97164000e+01], [2.63638545e-01, 4.02347945e-01, 7.08109239e+01, ..., 7.02872239e+01, 7.04596239e+01, 6.95565239e+01], [2.59688357e-01, 6.82876357e-01, 7.07703334e+01, ..., 7.02759334e+01, 6.91589334e+01, 7.01310334e+01], ..., [6.93155176e+01, 6.87564176e+01, 6.91942176e+01, ..., 6.66071176e+01, 6.81565176e+01, 7.05844176e+01], [6.93936865e+01, 6.92003865e+01, 6.94609865e+01, ..., 6.68730865e+01, 6.78987865e+01, 7.03048865e+01], [6.92384739e+01, 6.84930739e+01, 6.88023739e+01, ..., 6.60239739e+01, 6.76421739e+01, 3.05772869e-01]])

box = array([[71.375 , 0. , 0. , 0. , 71.375 , 0. , 0. , 0. , 71.375 ], ...])
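Since both data sets print similarly, comparing array shapes directly may be more informative than comparing number formats. The sketch below checks the deepmd/npy layout as I understand it: coord and force with shape (nframes, natoms * 3), box with shape (nframes, 9), and one energy per frame. The helper `check_system` and the random arrays are hypothetical stand-ins for the real files.

```python
import numpy as np


def check_system(coord, force, box, energy, natoms):
    """Return a list of human-readable shape problems (empty if none)."""
    nframes = coord.shape[0]
    problems = []
    if coord.shape != (nframes, natoms * 3):
        problems.append(f"coord has shape {coord.shape}")
    if force.shape != (nframes, natoms * 3):
        problems.append(f"force has shape {force.shape}")
    if box.shape != (nframes, 9):
        problems.append(f"box has shape {box.shape}")
    if energy.size != nframes:
        problems.append(f"energy has {energy.size} entries for {nframes} frames")
    return problems


# Illustrative arrays: 5 frames of a 4-atom system in a cubic box.
rng = np.random.default_rng(0)
nframes, natoms = 5, 4
coord = rng.random((nframes, natoms * 3))
force = rng.random((nframes, natoms * 3))
box = np.tile(np.diag([10.0, 10.0, 10.0]).reshape(1, 9), (nframes, 1))
energy = rng.random(nframes)

ok_report = check_system(coord, force, box, energy, natoms)
bad_report = check_system(coord, force, box[:, :6], energy, natoms)
```

In practice the arrays would come from `np.load` on the files in set.000, and `natoms` from the number of lines in type.raw.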