brucefan1983 / GPUMD

Graphics Processing Units Molecular Dynamics
https://gpumd.org/dev
GNU General Public License v3.0

STOF error message #423

Closed: mayankaditya closed this issue 1 year ago

mayankaditya commented 1 year ago

Hi,

I am trying to train the neural network model using AIMD data from systems with different stoichiometries. I am getting the following error:

File: main_nep/structure.cu Line: 145 Error message: stof

Please advise.

Thanks, Mayank

brucefan1983 commented 1 year ago

Could you please specify the GPUMD version? On the current master, line 145 of main_nep/structure.cu handles the energy= item of a structure. The error means the code ran into a problem while trying to read a floating-point number; I cannot infer more than that. You can also try reading your train.xyz with ASE (https://wiki.fysik.dtu.dk/ase/). If ASE reads it without error, you can send me your train.xyz via email and I will debug it for you.
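To make the diagnosis above concrete: the following is a minimal sketch, not part of GPUMD, that scans an extended-XYZ file frame by frame and reports comment lines whose energy= value cannot be parsed as a floating-point number. The function name find_bad_energy_frames and the regular expression are hypothetical, and Python's float() does not accept exactly the same strings as the parser in the C++ code, so a clean result is a hint rather than proof.

```python
import re

# Matches "energy=<token>" in a frame's comment line. The key name comes from
# the thread; the regex is an assumption about how the file is laid out.
ENERGY_RE = re.compile(r"(?i)\benergy=(\S+)")


def find_bad_energy_frames(path):
    """Return a list of (frame_index, line_number, reason) for suspicious frames."""
    with open(path) as fh:
        lines = fh.read().splitlines()

    problems = []
    i = 0      # index of the current line (0-based)
    frame = 0  # index of the current frame (0-based)
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        try:
            natoms = int(lines[i].split()[0])
        except ValueError:
            problems.append((frame, i + 1, "atom count is not an integer"))
            break  # the next frame cannot be located reliably
        if i + 1 >= len(lines):
            problems.append((frame, i + 1, "missing comment line"))
            break
        match = ENERGY_RE.search(lines[i + 1])
        if match is None:
            problems.append((frame, i + 2, "no energy= item"))
        else:
            try:
                float(match.group(1))
            except ValueError:
                reason = f"cannot parse energy value {match.group(1)!r}"
                problems.append((frame, i + 2, reason))
        if i + 2 + natoms > len(lines):
            problems.append((frame, i + 1, "frame is truncated"))
        i += 2 + natoms
        frame += 1
    return problems
```

Calling find_bad_energy_frames("train.xyz") returns an empty list when this check finds nothing wrong; otherwise each tuple gives the frame index, the 1-based line number, and a short reason.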

mayankaditya commented 1 year ago

I am using the recent version, 3.7. I extracted the input using the multiple-outcar script. Is there an updated deep2nep.py script for data conversion that is compatible with version 3.7?

Thanks, Mayank

brucefan1983 commented 1 year ago

So you have your training data in DeepMD format?

Currently, you need to run two Python scripts to make a full conversion to train.xyz:

  • step 1: use deep2nep.py to convert to train.in
  • step 2: use nep2xyz.py to convert train.in to train.xyz

This two-step route is historical, and we will add a deep2xyz.py script soon.

mayankaditya commented 1 year ago

Thanks for the quick response. My data is already in the GPUMD 3.7 format (generated with the multiple-outcar script from the GPUMD utilities). With a small number of frames the code works well, but with a larger set (12000 frames) it complains. I am using AIMD data from 96- and 144-atom simulations.

Thanks, Mayank

brucefan1983 commented 1 year ago

So this is not related to DeePMD; it is just VASP output.

I still suggest trying to read the .xyz file with ASE to see if there is any error.
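A minimal sketch of such a check, assuming the file is in extended-XYZ format and ASE is installed. It reads frames one at a time with ase.io.iread, so that a failure also shows how many frames were readable before it. The function name count_frames_with_ase is hypothetical.

```python
def count_frames_with_ase(path):
    """Read every frame with ASE and return the frame count.

    Returns None if ASE is not installed. Re-raises whatever error ASE
    reports for an unreadable frame, after printing how far it got.
    """
    try:
        from ase.io import iread  # imported lazily so the sketch loads without ASE
    except ImportError:
        return None

    count = 0
    try:
        for _atoms in iread(path, index=":", format="extxyz"):
            count += 1
    except Exception as exc:
        # The frame right after the last readable one is the first suspect.
        print(f"ASE failed after {count} readable frames: {exc!r}")
        raise
    return count
```

If count_frames_with_ase("train.xyz") returns the expected number of frames, ASE at least considers the file well-formed.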

Also, NEP does not need so many frames to train. The best way is to sample roughly one frame in every 100 from the AIMD trajectory and do more accurate single-point DFT calculations on the sampled frames.

Ovito can be very handy too.
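The sampling step suggested above can be sketched with the standard library alone. This is an illustration under assumptions, not a GPUMD tool: the function name sample_frames and the default stride are hypothetical, and the file is assumed to be a well-formed XYZ trajectory (an atom-count line, a comment line, then one line per atom). It covers only the frame selection; the single-point DFT calculations still have to be run separately.

```python
def sample_frames(src, dst, stride=100):
    """Copy every `stride`-th frame from src to dst; return (kept, total)."""
    kept = 0
    total = 0
    with open(src) as fin, open(dst, "w") as fout:
        while True:
            header = fin.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate blank lines between frames
            natoms = int(header.split()[0])
            # One comment line plus natoms atom lines follow the header.
            block = [header] + [fin.readline() for _ in range(natoms + 1)]
            if total % stride == 0:
                fout.writelines(block)
                kept += 1
            total += 1
    return kept, total
```

For example, sample_frames("aimd.xyz", "sampled.xyz", stride=100) keeps frames 0, 100, 200, and so on, where both file names are placeholders.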

brucefan1983 commented 1 year ago

I can only help debug the GPUMD code itself, since that is the only part I maintain. The scripts in tools/ are mostly contributed by users.

If you can send me a (small) train.xyz that reproduces the error, I am happy to debug it.

mayankaditya commented 1 year ago

Thank you for the response. I will check again with ASE.

Mayank

brucefan1983 commented 1 year ago

Have you solved the problem? Do you need help with debugging?

mayankaditya commented 1 year ago

Hi,

Yes, the problem is solved. The cause was broken data in the dataset. I found and removed it, and the code is now working fine. Thank you for your quick response and help.

Thanks, Mayank

brucefan1983 commented 1 year ago

That's great. I will then close this issue.