Hi,
I tried to run jerex_train.py in a Kaggle environment with the default configuration, apart from two overrides: distribution.gpus=[0] and distribution.prepare_data_per_node=true. I added the second override only to get past a conflict error with the DataModule configuration. The run then failed with the following error:
Error executing job with overrides: ['distribution.gpus=[0]', 'distribution.prepare_data_per_node=true']
Traceback (most recent call last):
File "jerex_train.py", line 20, in train
model.train(cfg)
File "/kaggle/working/jerex/model.py", line 341, in train
trainer.fit(model, datamodule=data_module)
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 769, in fit
self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 721, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 809, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1215, in _run
self.strategy.setup(self)
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/strategies/single_device.py", line 72, in setup
super().setup(trainer)
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/strategies/strategy.py", line 139, in setup
self.setup_optimizers(trainer)
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/strategies/strategy.py", line 129, in setup_optimizers
self.lightning_module
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/core/optimizer.py", line 180, in _init_optimizers_and_lr_schedulers
optim_conf = model.trainer._call_lightning_module_hook("configure_optimizers", pl_module=model)
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1593, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/kaggle/working/jerex/model.py", line 194, in configure_optimizers
dataloader = self.train_dataloader()
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/core/hooks.py", line 494, in train_dataloader
raise MisconfigurationException("`train_dataloader` must be implemented to be used with the Lightning Trainer")
pytorch_lightning.utilities.exceptions.MisconfigurationException: `train_dataloader` must be implemented to be used with the Lightning Trainer
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
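
One possible reading of the traceback: configure_optimizers in jerex/model.py calls self.train_dataloader() on the model itself, while the dataloaders are defined on the DataModule passed to trainer.fit(), so the call falls through to Lightning's default hook, which raises. The sketch below reproduces that pattern with plain-Python stand-ins. BaseModule, DataModule, Trainer, FailingModel, WorkaroundModel and run are all hypothetical names for illustration, not JEREX or Lightning code. The workaround shown assumes the installed Lightning version exposes the DataModule as trainer.datamodule, and it has not been checked against the real model.py.

```python
class MisconfigurationException(Exception):
    """Stand-in for Lightning's MisconfigurationException (hypothetical copy)."""


class BaseModule:
    """Hypothetical stand-in for LightningModule: the default hook just raises."""

    trainer = None

    def train_dataloader(self):
        raise MisconfigurationException(
            "`train_dataloader` must be implemented to be used with the Lightning Trainer"
        )


class DataModule:
    """Hypothetical stand-in for the DataModule handed to trainer.fit()."""

    def train_dataloader(self):
        # A list of three fake batches stands in for a real DataLoader.
        return ["batch-0", "batch-1", "batch-2"]


class Trainer:
    """Hypothetical stand-in that only keeps a reference to the datamodule."""

    def __init__(self, datamodule):
        self.datamodule = datamodule


class FailingModel(BaseModule):
    def configure_optimizers(self):
        # Same call as model.py line 194 in the traceback:
        # the model has no dataloader of its own, so the base default raises.
        dataloader = self.train_dataloader()
        return len(dataloader)


class WorkaroundModel(BaseModule):
    def configure_optimizers(self):
        # Possible workaround (assumption): ask the attached datamodule instead.
        dataloader = self.trainer.datamodule.train_dataloader()
        return len(dataloader)


def run(model_cls):
    """Attach a trainer with a datamodule, then call configure_optimizers."""
    model = model_cls()
    model.trainer = Trainer(DataModule())
    return model.configure_optimizers()
```

Calling run(FailingModel) raises the same message as in the traceback, while run(WorkaroundModel) returns the number of fake batches. Whether the equivalent change is the right fix for JEREX depends on how the scheduler in configure_optimizers uses the dataloader.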