Markus28 opened 1 month ago
I found that it is possible to avoid the exception by commenting out the lines here:
https://github.com/cvignac/DiGress/blob/7a36a84103a6e4b732953459515a479f12e8ff3b/src/main.py#L153
However, it is still unclear to me how many samples we should use to faithfully reproduce the results from the paper. The config `experiments/planar.yaml` says 40, while `general_default.yaml` says 10k. The former would lead to a large variance in the evaluation results, while the latter would take quite a long time to evaluate (roughly 10 hours on an H100, I believe).
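For what it's worth, hydra lets you override the sample count from the command line, so one could pick a middle ground without editing either config. A sketch, assuming the keys are `general.test_only` and `general.final_model_samples_to_generate` as they appear in my checkout (please verify against yours):

```
# Hypothetical middle ground: draw 1000 samples instead of 40 or 10k.
# Both override names are taken from the configs in my checkout; verify in yours.
python3 main.py +experiment=planar general.test_only=/path/to/checkpoint.ckpt \
    general.final_model_samples_to_generate=1000
```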
I get the same problem. It seems that if you train the model in a distributed way (via DDP), you cannot load the checkpoint directly with `load_from_checkpoint`; everything must go through the `Trainer`.
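As a workaround, the following sketch worked for me conceptually (assuming a standard pytorch-lightning 2.x setup; `model` and `datamodule` stand for the module and data module you instantiate from the hydra config, so this is not a drop-in snippet):

```python
# Minimal sketch (pytorch-lightning 2.x): rather than calling load_from_checkpoint
# on a DDP-trained checkpoint, pass ckpt_path to Trainer.test and let Lightning
# restore the weights into a freshly constructed module.
import pytorch_lightning as pl

trainer = pl.Trainer(accelerator="gpu", devices=1)  # evaluate on a single device

# model: the instantiated LightningModule; datamodule: the matching datamodule.
trainer.test(model, datamodule=datamodule, ckpt_path="path/to/checkpoint.ckpt")
```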
Line 155 only retrieves the configuration; if you use the exact same hydra setup as when you trained the model, it should be fine.
Another comment on this issue: I am wondering whether there is a stable branch without distributed training? That would be more convenient, since I want to tweak the sampling method and the evaluation metrics. Thanks in advance.
I was wondering how the trained models are intended to be evaluated. I don't believe the paper states how many samples were used to compute the metrics. The code gives some indication, but the testing functionality seems broken. Assuming we train a model via:
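(roughly; `+experiment=planar` is my guess for selecting `experiments/planar.yaml`, so adjust the override to your config layout)

```
python3 main.py +experiment=planar
```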
I would expect that we evaluate it on the test set via:
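(again a sketch; `general.test_only` is the key I see in `general_default.yaml` for pointing a run at a trained checkpoint)

```
python3 main.py +experiment=planar general.test_only=/path/to/checkpoint.ckpt
```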
Unfortunately, this functionality is broken and gives this stack trace:
I am using `torch==2.0.1+cu118` and `pytorch-lightning==2.0.4`, as specified in the requirements. So how are we actually supposed to evaluate the model? I think some instructions in the README would be valuable. Thanks for your help!