Youhaojen opened this issue 6 months ago
Dear Youhaojen,
The number of atoms in your database is too large: an Mg2Ge structure contains 96 atoms, and the memory required to train on it exceeds that of an RTX 3090. You can first try smaller features, such as 32x0o+128x0e+32x1o+32x1e+32x2e+32x2o+32x3o+32x3e+32x4o+32x4e. If that still exceeds the GPU's memory, you will have to use smaller structures (fewer than 40 atoms in the unit cell, I guess) for the training set. By the way, considering that your training set contains only 60 structures, employing multiple GPUs may not be necessary.
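To give a rough sense of why smaller features help, here is a minimal sketch that computes the per-atom feature width of the suggested setting, assuming the feature string in config.yaml is an e3nn-style irreps specification (as the notation suggests); the second, smaller string is a purely hypothetical alternative for comparison. Node-feature memory scales with this width times the number of atoms, so reducing multiplicities or dropping high-l channels shrinks it quickly.

```python
from e3nn.o3 import Irreps

# Sketch only: assumes the feature string is an e3nn-style irreps spec.
suggested = Irreps("32x0o+128x0e+32x1o+32x1e+32x2e+32x2o+32x3o+32x3e+32x4o+32x4e")

# Per-atom feature width = sum of multiplicity * (2l + 1) over all irreps.
print(f"per-atom feature dimension: {suggested.dim}")

# Hypothetical smaller alternative: reduced multiplicities, no l=3/l=4 channels.
smaller = Irreps("32x0o+64x0e+16x1o+16x1e+16x2e+16x2o")
print(f"reduced feature dimension: {smaller.dim}")
```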
Best wishes, Yang Zhong
Dear Yang Zhong,
Thanks for your kind reply. The 96-atom Mg2Ge structures are indeed too large for the RTX 3090, so I used 200 unit-cell structures for training the model instead. It is working now.
Thank you again.
Best, Hao-Jen You
Dear Zhong Yang,
I tried to train the model with 3 RTX 3090 GPUs. The training data were generated via abacus and consist of 60 structures of 96-atom Mg2Ge. I get an error when running the training process; the full error message is below, followed by my config.yaml. I suspect the number of atoms in my database is too large, which causes the error, but I'm not entirely certain of the exact reason. Could you provide some guidance?
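In case it is useful for debugging, here is a small diagnostic sketch (plain PyTorch, not part of the training code) that prints the total and currently allocated memory of each visible GPU; it can help confirm whether the 3090s are actually running out of memory rather than failing for another reason.

```python
import torch

# Minimal diagnostic sketch: report per-GPU memory as seen by PyTorch.
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        total_gb = props.total_memory / 1024**3
        allocated_gb = torch.cuda.memory_allocated(i) / 1024**3
        print(f"GPU {i}: {props.name}, "
              f"total {total_gb:.1f} GiB, allocated {allocated_gb:.1f} GiB")
else:
    print("No CUDA device visible to PyTorch.")
```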