kyonofx / mlcgmd

[TMLR 2023] Simulate time-integrated coarse-grained MD with multi-scale graph neural networks
MIT License

Problems when training a new model #4

Closed shuhaom closed 1 year ago

shuhaom commented 1 year ago

Hi,

Thanks for sharing your work and code.

I tested the single-chain example and it runs fine. But I get the following error when trying to train on a new data set:

Traceback (most recent call last):
  File "/home/user/work_msh/multiGNN/Fu2023_origin/graphwm/train.py", line 165, in main
    run(cfg)
  File "/home/user/work_msh/multiGNN/Fu2023_origin/graphwm/train.py", line 153, in run
    trainer.fit(model=model, datamodule=datamodule)
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in fit
    self._call_and_handle_interrupt(
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 721, in _call_and_handle_interrupt
    return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
    return function(*args, **kwargs)
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 811, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1236, in _run
    results = self._run_stage()
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1323, in _run_stage
    return self._run_train()
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1345, in _run_train
    self._run_sanity_check()
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1406, in _run_sanity_check
    val_loop._reload_evaluation_dataloaders()
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 237, in _reload_evaluation_dataloaders
    self.trainer.reset_val_dataloader()
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1965, in reset_val_dataloader
    self.num_val_batches, self.val_dataloaders = self._data_connector._reset_eval_dataloader(
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 372, in _reset_eval_dataloader
    dataloaders = self._request_dataloader(mode, model=model)
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 459, in _request_dataloader
    dataloader = source.dataloader()
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 536, in dataloader
    return method()
  File "/home/user/work_msh/multiGNN/Fu2023_origin/graphwm/data/datamodule.py", line 84, in val_dataloader
    return [
  File "/home/user/work_msh/multiGNN/Fu2023_origin/graphwm/data/datamodule.py", line 85, in <listcomp>
    DataLoader(
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 351, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore[arg-type]
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/torch/utils/data/sampler.py", line 106, in __init__
    if not isinstance(self.num_samples, int) or self.num_samples <= 0:
  File "/home/user/.conda/envs/gnn_cuda/lib/python3.10/site-packages/torch/utils/data/sampler.py", line 114, in num_samples
    return len(self.data_source)
ValueError: __len__() should return >= 0
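
For context, the final ValueError comes from Python's built-in len(), which raises it whenever an object's __len__ returns a negative number. The last frame shows it being called on the validation dataset. A minimal sketch of how that can happen, assuming a dataset whose length is derived by subtracting a window size from the number of available frames; WindowedDataset, num_frames, window, and describe_len are hypothetical names for illustration only, not the repository's code:

```python
class WindowedDataset:
    """Hypothetical dataset: one sample per sliding window over a trajectory."""

    def __init__(self, num_frames, window):
        self.num_frames = num_frames  # frames available in the trajectory
        self.window = window          # frames consumed by one sample

    def __len__(self):
        # Goes negative when the trajectory is shorter than the window.
        return self.num_frames - self.window + 1


def describe_len(dataset):
    """Return len(dataset), or the error message if len() rejects it."""
    try:
        return len(dataset)
    except ValueError as err:
        return str(err)


print(describe_len(WindowedDataset(num_frames=100, window=10)))
print(describe_len(WindowedDataset(num_frames=5, window=10)))
```

If something similar is happening here, printing the length-related attributes of the validation dataset before the DataLoader is built should show which quantity is smaller than expected for the new data.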