deepmodeling / deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics
https://docs.deepmodeling.com/projects/deepmd/
GNU Lesser General Public License v3.0

[BUG] PT backend logger error: AttributeError: 'tuple' object has no attribute 'items' #3771

Closed Yi-FanLi closed 1 month ago

Yi-FanLi commented 2 months ago

Bug summary

I am training a model with the PyTorch backend, using the same setup that I used with the TensorFlow backend. However, training raises an AttributeError as soon as the first batch has been logged.

DeePMD-kit Version

3.0.0a0

Backend and its version

PyTorch v2.0.0.post200-gc263bd43e8e

How did you download the software?

docker

Input Files, Running Commands, Error Log, etc.

Error log:

[2024-05-11 04:58:48,328] DEEPMD INFO found 1 system(s):
[2024-05-11 04:58:48,328] DEEPMD INFO system   natoms  bch_sz  n_bch  prob       pbc
[2024-05-11 04:58:48,328] DEEPMD INFO O64H128  192     1       100    1.000e+00  T
[2024-05-11 04:58:48,328] DEEPMD INFO --------------------------------------------------------------------------------------
[2024-05-11 04:58:48,331] DEEPMD INFO Start to train 400000 steps.
[2024-05-11 04:58:54,615] DEEPMD INFO batch 0: trn: rmse = 2.36e+01, rmse_e = 5.74e-01, rmse_f = 7.45e-01, lr = 1.00e-03
Traceback (most recent call last):
  File "/opt/deepmd-kit/bin/dp", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/deepmd-kit/lib/python3.11/site-packages/deepmd/main.py", line 805, in main
    deepmd_main(args)
  File "/opt/deepmd-kit/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/opt/deepmd-kit/lib/python3.11/site-packages/deepmd/pt/entrypoints/main.py", line 308, in main
    train(FLAGS)
  File "/opt/deepmd-kit/lib/python3.11/site-packages/deepmd/pt/entrypoints/main.py", line 280, in train
    trainer.run()
  File "/opt/deepmd-kit/lib/python3.11/site-packages/deepmd/pt/train/training.py", line 859, in run
    step(step_id, model_key)
  File "/opt/deepmd-kit/lib/python3.11/site-packages/deepmd/pt/train/training.py", line 757, in step
    format_training_message_per_task(
  File "/opt/deepmd-kit/lib/python3.11/site-packages/deepmd/loggers/training.py", line 29, in format_training_message_per_task
    rmse = dict(sorted(rmse.items()))
           ^^^^^^^^^^
AttributeError: 'tuple' object has no attribute 'items'
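For readers following the traceback, the last frame can be reproduced in isolation. The snippet below is only a minimal sketch: `format_rmse` is a hypothetical stand-in for the failing line in `deepmd/loggers/training.py`, not the real DeePMD-kit function, and the tuple argument is an assumed illustration, since the thread does not say why a tuple reached the logger. It shows that the line works for a dict of metrics and raises the same AttributeError for a tuple.

```python
def format_rmse(rmse):
    """Hypothetical stand-in for the failing line; not the DeePMD-kit API."""
    # Same expression as line 29 in the traceback: needs a mapping with .items().
    rmse = dict(sorted(rmse.items()))
    return ", ".join(f"{key} = {value:.2e}" for key, value in rmse.items())


# A dict of metrics (values taken from the "batch 0" log line) formats fine.
ok = format_rmse({"rmse_f": 0.745, "rmse": 23.6, "rmse_e": 0.574})

# A tuple has no .items(), so the same call fails as in the error log.
# The tuple contents here are made up for illustration.
try:
    format_rmse(({"rmse": 23.6}, {"rmse": 23.6}))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

If the error appears on an installed version, checking the type of the value handed to the logger at the call site in `deepmd/pt/train/training.py` is a reasonable first step.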

Steps to Reproduce

See the tarball.

Further Information, Files, and Links

issue_logger.tar.gz

njzjz commented 2 months ago

I think this issue has been fixed. Could you try the latest commit?

Yi-FanLi commented 1 month ago

I confirm that the current devel branch does not have this issue.