error in train with ptlflow_demo_train.ipynb

Cyrus1993 commented 2 years ago

Hi, When I run this code in Colab. I get the following error. Please advise. I did not change anything.

!python train.py raft_small \ --gpus 1 \ --train_dataset overfit-sintel \ --val_dataset none \ --train_batch_size 1 \ --max_epochs 100 \ --lr 1e-3

ERROR: torch_scatter not found. CSV requires torch_scatter library to run. Check instructions at: https://github.com/rusty1s/pytorch_scatter Global seed set to 1234 GPU available: True, used: True TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0] 01/02/2022 08:49:01 - WARNING: --train_crop_size is not set. It will be set as (432, 1024). 01/02/2022 08:49:01 - INFO: Loading 1 samples from Sintel_clean dataset. Traceback (most recent call last): File "train.py", line 152, in train(args) File "train.py", line 111, in train trainer.fit(model) File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 732, in fit self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 682, in _call_and_handle_interrupt return trainer_fn(*args, *kwargs) File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 768, in _fit_impl results = self._run(model, ckpt_path=ckpt_path) File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1155, in _run self.strategy.setup(self) File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/single_device.py", line 76, in setup super().setup(trainer) File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py", line 118, in setup self.setup_optimizers(trainer) File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py", line 108, in setup_optimizers self.lightning_module File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/optimizer.py", line 174, in _init_optimizers_and_lr_schedulers optim_conf = model.trainer._call_lightning_module_hook("configure_optimizers", pl_module=model) File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1535, in _call_lightning_module_hook output = fn(args, **kwargs) File "/usr/local/lib/python3.7/dist-packages/ptlflow/models/base_model/base_model.py", line 358, in configure_optimizers optimizer, self.args.lr, total_steps=self.args.max_steps, pct_start=0.05, cycle_momentum=False, anneal_strategy='linear') File "/usr/local/lib/python3.7/dist-packages/torch/optim/lr_scheduler.py", line 1452, in init raise ValueError("Expected positive integer total_steps, but got {}".format(total_steps)) ValueError: Expected positive integer total_steps, but got -1

hmorimitsu commented 2 years ago

Hi, thank you for reporting this error.

This was caused by a change in the default value of max_steps in newer versions of PyTorch Lightning. The code in this repo was already updated, but the version in PyPI was not.

I just pushed a new version to PyPI, and the colab example should work now. If it doesn't, please check if the ptlflow version being installed is the 0.2.5. If it isn't, you can try to force to install the new version with pip install --upgrade ptlflow.

I hope that helps.

Cyrus1993 commented 2 years ago

Hi. excellent. It works exactly right. Thank you so much for developing this awesome repository.

hmorimitsu / ptlflow

error in train with ptlflow_demo_train.ipynb #31