QData / spacetimeformer

Multivariate Time Series Forecasting with efficient Transformers. Code for the paper "Long-Range Transformers for Dynamic Spatiotemporal Forecasting."
https://arxiv.org/abs/2109.12218
MIT License

`configure_optimizers` must include a monitor when a `ReduceLROnPlateau` scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"} #94

Open pdy265 opened 8 months ago

pdy265 commented 8 months ago

When running `python ./spacetimeformer/train.py spacetimeformer mnist --embed_method spatio-temporal --local_self_attn full --local_cross_attn full --global_self_attn full --global_cross_attn full --run_name mnist_spatiotemporal --context_points 10 --gpus 0`, I get this error:

Traceback (most recent call last):
  File "./spacetimeformer/train.py", line 869, in <module>
    main(args)
  File "./spacetimeformer/train.py", line 849, in main
    trainer.fit(forecaster, datamodule=data_module)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
    self._call_and_handle_interrupt(
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 724, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1218, in _run
    self.strategy.setup(self)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/strategies/dp.py", line 70, in setup
    super().setup(trainer)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 139, in setup
    self.setup_optimizers(trainer)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 128, in setup_optimizers
    self.optimizers, self.lr_scheduler_configs, self.optimizer_frequencies = _init_optimizers_and_lr_schedulers(
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 203, in _init_optimizers_and_lr_schedulers
    _configure_schedulers_automatic_opt(lr_schedulers, monitor)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 318, in _configure_schedulers_automatic_opt
    raise MisconfigurationException(
pytorch_lightning.utilities.exceptions.MisconfigurationException: `configure_optimizers` must include a monitor when a `ReduceLROnPlateau` scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}

How can I deal with this?
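For context, Lightning raises this exception because it will not step a `ReduceLROnPlateau` scheduler without knowing which logged metric to watch. Below is a minimal sketch of the return format the message asks for; the placeholder module, optimizer, scheduler settings, and the `val/loss` metric name are illustrative assumptions, not this repository's code. The accepted fix for spacetimeformer itself is in the comment further down.

```python
import torch
import pytorch_lightning as pl


class SketchForecaster(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # Placeholder layer so self.parameters() is non-empty.
        self.head = torch.nn.Linear(8, 1)

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=1e-3)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=3)
        # A ReduceLROnPlateau scheduler must be paired with the logged
        # metric it monitors for plateaus.
        return {
            "optimizer": optimizer,
            "lr_scheduler": scheduler,
            "monitor": "val/loss",
        }
```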

anibalpedraza commented 7 months ago

I am facing the same problem, and the environment seems to have the proper package versions:

absl-py 2.1.0 aiohttp 3.9.3 aiosignal 1.3.1 antlr4-python3-runtime 4.9.3 appdirs 1.4.4 async-timeout 4.0.3 attrs 23.2.0 axial-positional-embedding 0.2.1 cachetools 5.3.3 certifi 2024.2.2 cftime 1.6.3 chardet 5.2.0 charset-normalizer 3.3.2 click 8.1.7 cmdstanpy 0.9.68 colorama 0.4.6 contourpy 1.1.1 convertdate 2.4.0 cycler 0.12.1 Cython 3.0.10 docker-pycreds 0.4.0 einops 0.7.0 filelock 3.13.3 fonttools 4.50.0 frozenlist 1.4.1 fsspec 2024.3.1 gitdb 4.0.11 GitPython 3.1.43 google-auth 2.29.0 google-auth-oauthlib 1.0.0 grpcio 1.62.1 idna 3.6 importlib-metadata 7.1.0 importlib-resources 6.4.0 Jinja2 3.1.3 joblib 1.3.2 kiwisolver 1.4.5 local-attention 1.9.0 Markdown 3.6 MarkupSafe 2.1.5 matplotlib 3.7.5 mpmath 1.3.0 multidict 6.0.5 netCDF4 1.6.5 networkx 3.1 numpy 1.24.4 nystrom-attention 0.0.11 oauthlib 3.2.2 omegaconf 2.3.0 opencv-python 4.9.0.80 opt-einsum 3.3.0 packaging 24.0 pandas 2.0.3 performer-pytorch 1.1.4 pillow 10.3.0 pip 21.1.1 protobuf 4.25.3 psutil 5.9.8 pyasn1 0.6.0 pyasn1-modules 0.4.0 pyDeprecate 0.3.2 PyMeeus 0.5.12 pyparsing 3.1.2 pystan 2.19.1.1 python-dateutil 2.9.0.post0 pytorch-lightning 1.6.0 pytz 2024.1 PyYAML 6.0.1 requests 2.31.0 requests-oauthlib 2.0.0 rsa 4.9 scikit-learn 1.3.2 scipy 1.10.1 seaborn 0.13.2 sentry-sdk 1.44.1 setproctitle 1.3.3 setuptools 56.0.0 six 1.16.0 smmap 5.0.1 spacetimeformer 1.5.0
sympy 1.12 tensorboard 2.14.0 tensorboard-data-server 0.7.2 threadpoolctl 3.4.0 torch 1.11.0+cu113 torchaudio 0.11.0+cu113 torchmetrics 0.5.1 torchvision 0.12.0+cu113 tqdm 4.66.2 typing-extensions 4.10.0 tzdata 2024.1 ujson 5.9.0 urllib3 2.2.1 wandb 0.16.6 werkzeug 3.0.2 wheel 0.43.0 yarl 1.9.4 zipp 3.18.1

anibalpedraza commented 7 months ago

I have managed to solve the problem based on the answer provided in https://github.com/QData/spacetimeformer/issues/79#issuecomment-2017100498

I added some changes to that answer, since the variables should be defined with `self`. The fix is to modify `spacetimeformer_model.py` (under `spacetimeformer/spacetimeformer_model`): change the `configure_optimizers` function so that it looks like this:

def configure_optimizers(self):
    self.optimizer = torch.optim.AdamW(
        self.parameters(),
        lr=self.base_lr,
        weight_decay=self.l2_coeff,
    )
    self.scheduler = stf.lr_scheduler.WarmupReduceLROnPlateau(
        self.optimizer,
        init_lr=self.init_lr,
        peak_lr=self.base_lr,
        warmup_steps=self.warmup_steps,
        patience=3,
        factor=self.decay_factor,
    )

    # Metric the plateau scheduler should watch.
    monitor = "val/loss"

    # Lightning requires the monitored metric to be returned alongside
    # a ReduceLROnPlateau-style scheduler.
    return {
        "optimizer": self.optimizer,
        "lr_scheduler": {
            "scheduler": self.scheduler,
            "monitor": monitor,
        },
    }
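With this change, Lightning receives the `ReduceLROnPlateau`-style scheduler together with the name of the metric it should watch, which resolves the `MisconfigurationException` above. Nesting `"monitor"` inside the `"lr_scheduler"` dict is equivalent to passing it at the top level of the returned dict, as the exception message suggests; either form works in Lightning 1.6, provided the model actually logs a metric under that name (here `val/loss`).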