Closed QuantumMisaka closed 1 year ago
I guess the error may lie in my dataset, which has a different number of atoms in each structure frame.
But I cannot find a way to load multiple datasets in the YAML input, which can easily be done in a DeepMD
JSON-format input file.
The problem lies in the format of the loss function. After I changed the loss function to:

```yaml
# loss function
loss_coeffs:  # different weights to use in a weighted loss function
  forces:     # if using PerAtomMSELoss, a default weight of 1:1 on each should work well
    - 5
    - MSELoss
  total_energy:
    - 1
    - MSELoss
```

`batch_size` can be any number and no problem emerges.
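For context, a weighted multi-term loss of this form can be sketched as follows. This is a minimal illustration of the idea, not NequIP's actual implementation; the function and parameter names are made up:

```python
import numpy as np

def weighted_loss(pred_forces, true_forces, pred_energy, true_energy,
                  force_coeff=5.0, energy_coeff=1.0):
    """Combine per-term MSE losses using the weights from loss_coeffs.

    Illustrative sketch: each term is an ordinary MSE, and the total
    loss is their weighted sum (here 5:1 for forces:total_energy,
    matching the YAML above).
    """
    force_mse = np.mean((np.asarray(pred_forces) - np.asarray(true_forces)) ** 2)
    energy_mse = np.mean((np.asarray(pred_energy) - np.asarray(true_energy)) ** 2)
    return force_coeff * force_mse + energy_coeff * energy_mse

# e.g. weighted_loss([0.0], [1.0], [2.0], [2.0]) -> 5.0
```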
I think the problem is in the function `PerAtomMSELoss`.
You should use

```yaml
loss_coeffs:  # different weights to use in a weighted loss function
  forces: 1.0
  total_energy:
    - 1.0
    - PerAtomMSELoss
```

instead of

```yaml
loss_coeffs:  # different weights to use in a weighted loss function
  forces:     # if using PerAtomMSELoss, a default weight of 1:1 on each should work well
    - 1
    - PerAtomMSELoss
  total_energy:
    - 1
    - PerAtomMSELoss
```

`PerAtomMSELoss` works for `total_energy`, not for `forces`.
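To illustrate why a per-atom loss makes sense for a total energy: the scalar energy of each frame is divided by that frame's atom count before the squared error is taken, so frames of different sizes contribute on the same scale. This is an assumed sketch of the idea, not the actual NequIP code:

```python
import numpy as np

def per_atom_energy_mse(pred_E, true_E, n_atoms):
    """Illustrative per-atom energy MSE (hypothetical helper).

    Each frame's total energy is normalized by its atom count, so a
    dataset with varying atoms-per-frame is handled consistently.
    Forces are already per-atom quantities, so this normalization has
    no analogous meaning for them.
    """
    pred = np.asarray(pred_E, dtype=float) / np.asarray(n_atoms, dtype=float)
    true = np.asarray(true_E, dtype=float) / np.asarray(n_atoms, dtype=float)
    return float(np.mean((pred - true) ** 2))

# e.g. per_atom_energy_mse([-10.0, -20.0], [-10.5, -19.0], [5, 10]) -> 0.01
```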
@Hongyu-yu is entirely right here; thanks for responding to this!
I've added a better error message for this case.
**Describe the bug**
When I use NequIP for training and validation on a Fe-C-H-O dataset (12135 training frames and 1000 test frames) with a `batchsize` of 5 for both training and validation, an error occurs. I have tried many different batch sizes for training and validation, all of which failed unless I set `batchsize` to 1 for both training and validation, but `batchsize=1` is very inefficient.

**To Reproduce**
This is my yaml file:
Just run

```
nequip-train <yamlfile>
```

in a normal nequip environment.

**Expected behavior**
The training process runs properly, as in the NequIP examples.
**Environment (please complete the following information):**

**Additional context**
Specific data will be uploaded: nequip_1000vaild5.log, FeCHO_nequip.1.tar.gz, FeCHO_nequip.2.tar.gz