ulissigroup / amptorch

AMPtorch: Atomistic Machine Learning Package (AMP) - PyTorch
GNU General Public License v3.0

Error in prediction when setting debug to true #96

Closed hsulab closed 3 years ago

hsulab commented 3 years ago

Dear Developers and Users,

When I set debug to true in the config, training ran fine, but an error occurred during prediction.

    100              0.0202              0.0297        0.0007            0.0119            0.0148        0.0001     +  0.0163
Training completed in 1.969996690750122s
Traceback (most recent call last):
  File "./train_example.py", line 87, in <module>
    predictions = trainer.predict(images)
  File "/users/amptorch/trainer.py", line 245, in predict
    self.descriptor = construct_descriptor(self.config["dataset"]["descriptor"])
KeyError: 'descriptor'

After checking the source code, I found the following lines gave rise to this issue.

https://github.com/ulissigroup/amptorch/blob/bd8af57cdfbe323f08b9480585904c52dc3821f8/amptorch/trainer.py#L128

I don't understand why self.config["dataset"]["descriptor"] is only set under `if not self.debug`. Shouldn't it be set regardless of whether debug is on?
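To illustrate the pattern, here is a minimal sketch that reproduces the same kind of KeyError. `ToyTrainer`, its `fixed` flag, and the placeholder descriptor value are all hypothetical and are not the actual amptorch code; the sketch only assumes that the key is written in one method under `if not self.debug` and read back later in `predict`.

```python
class ToyTrainer:
    """Hypothetical stand-in for a trainer; NOT the amptorch implementation."""

    def __init__(self, config, fixed=False):
        self.config = config
        self.debug = config.get("debug", False)
        self.fixed = fixed  # hypothetical switch between buggy and fixed variants

    def load_dataset(self):
        descriptor_setup = ("gaussian", {"cutoff": 6.0})  # placeholder value
        if self.fixed:
            # Fixed variant: always record the descriptor settings.
            self.config["dataset"]["descriptor"] = descriptor_setup
        elif not self.debug:
            # Buggy variant: the key is only written outside debug mode.
            self.config["dataset"]["descriptor"] = descriptor_setup

    def predict(self):
        # Raises KeyError if load_dataset() never stored the key.
        return self.config["dataset"]["descriptor"]


buggy = ToyTrainer({"debug": True, "dataset": {}})
buggy.load_dataset()
try:
    buggy.predict()
except KeyError as err:
    print("KeyError:", err)

fixed = ToyTrainer({"debug": True, "dataset": {}}, fixed=True)
fixed.load_dataset()
print(fixed.predict())
```

In this toy version, moving the assignment out of the debug guard is enough for prediction to work in both modes.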

Many thanks, Jiayan

mshuaibii commented 3 years ago

Hi Jiayan,

Thanks for catching this. This portion was recently modified and I must have missed testing the debug case. I have updated the code and the corresponding tests to account for this in #97. Hope this resolves it for you!

FWIW - since debug mode prevents writing checkpoints, etc. to disk, the trained model will use the parameters of the last epoch rather than those of the best epoch (which would have been saved to disk if debug=False).
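As a rough illustration of the last-epoch versus best-epoch point, here is a toy sketch. The function `final_epoch` and its in-memory "checkpoint" are hypothetical and do not reflect how amptorch stores or loads checkpoints; the sketch only assumes that a best-epoch checkpoint is kept when debug is off and skipped when it is on.

```python
def final_epoch(val_losses, debug):
    """Return the index of the epoch whose parameters the model ends up with.

    Toy model of the behaviour described above, not amptorch code.
    """
    best_loss = float("inf")
    checkpoint = None  # stands in for a checkpoint file on disk
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            if not debug:
                checkpoint = epoch  # "save" the best parameters so far
    last_epoch = len(val_losses) - 1
    # Without a saved checkpoint, only the last epoch's parameters remain.
    return checkpoint if checkpoint is not None else last_epoch


losses = [0.5, 0.2, 0.3]
print(final_epoch(losses, debug=True))
print(final_epoch(losses, debug=False))
```

With these made-up losses, the debug run ends on the final epoch (index 2), while the non-debug run falls back to the epoch with the lowest validation loss (index 1).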

hsulab commented 3 years ago

Thanks again! This helps me a lot. I will download the new code and the issue can be closed.