kashif / pytorch-transformer-ts

Repository of Transformer based PyTorch Time Series Models
MIT License
289 stars, 41 forks

Do you have a plan to upgrade pytorch_lightning to lightning? #28

Open hohe12ly opened 10 months ago

hohe12ly commented 10 months ago

I installed the dependencies from the default requirements.txt with Python 3.10 and ran lag-llama/train.py with the default settings and --seed 0. The run failed.

The error seems to come from mixing the old pytorch_lightning imports with the new from lightning.pytorch import ... style. Do you plan to upgrade to the new version of lightning?

The installed packages include lightning 2.1.2, pytorch-lightning 2.1.2, and gluonts 0.14.3. The full list:

MarkupSafe-2.1.3 PasteDeploy-3.1.0 PyYAML-6.0.1 SQLAlchemy-2.0.23 aiohttp-3.9.1 aiosignal-1.3.1 annotated-types-0.6.0 anykeystore-0.2 apex-0.9.10.dev0 async-timeout-4.0.3 attrs-23.1.0 axial-positional-embedding-0.2.1 certifi-2023.11.17 charset-normalizer-3.3.2 colt5-attention-0.10.19 cryptacular-1.6.2 datasets-2.15.0 defusedxml-0.7.1 dill-0.3.7 einops-0.7.0 etsformer-pytorch-0.1.1 fairscale-0.4.0 filelock-3.13.1 frozenlist-1.4.1 fsspec-2023.10.0 gluonts-0.14.3 greenlet-3.0.2 hopfield-layers-1.0.2 huggingface-hub-0.20.1 hupper-1.12 idna-3.6 jinja2-3.1.2 keopscore-2.1.2 lightning-2.1.2 lightning-utilities-0.10.0 local-attention-1.9.0 mpmath-1.3.0 multidict-6.0.4 multiprocess-0.70.15 networkx-3.2.1 numpy-1.26.2 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.18.1 nvidia-nvjitlink-cu12-12.3.101 nvidia-nvtx-cu12-12.1.105 oauthlib-3.2.2 opt_einsum-3.3.0 orjson-3.9.10 packaging-23.2 pandas-2.1.4 pbkdf2-1.3 pillow-10.1.0 plaster-1.1.2 plaster-pastedeploy-1.0.1 product-key-memory-0.2.10 pyarrow-14.0.2 pyarrow-hotfix-0.6 pybind11-2.11.1 pydantic-2.5.2 pydantic-core-2.14.5 pykeops-2.1.2 pyramid-2.0.2 pyramid-mailer-0.15.1 python-dateutil-2.8.2 python3-openid-3.2.0 pytorch-lightning-2.1.2 pytz-2023.3.post1 reformer_pytorch-1.4.4 repoze.sendmail-4.4.1 requests-2.31.0 requests-oauthlib-1.3.1 scipy-1.11.4 six-1.16.0 sympy-1.12 timm-0.6.13 toolz-0.12.0 torch-2.1.2 torchmetrics-1.2.1 torchscale-0.2.0 torchvision-0.16.2 tqdm-4.66.1 transaction-4.0 translationstring-1.4 triton-2.1.0 typing-extensions-4.9.0 tzdata-2023.3 urllib3-2.1.0 velruse-1.1.1 venusian-3.1.0 webob-1.8.7 wtforms-3.1.1 wtforms-recaptcha-0.3.2 xformers-0.0.23.post1 xxhash-3.4.1 yarl-1.9.4 zope.deprecation-5.0 zope.interface-6.1 zope.sqlalchemy-3.1

The error:

Seed set to 0
/mnt/data/home/yxl/conda/envs/lagllama_vanilla/lib/python3.10/site-packages/gluonts/dataset/common.py:262: FutureWarning: Period with BDay freq is deprecated and will be removed in a future version. Use a DatetimeIndex with BDay freq instead.
  return pd.Period(val, freq)
/mnt/data/home/yxl/test/pytorch-transformer-ts/lag-llama/train.py:109: FutureWarning: Period with BDay freq is deprecated and will be removed in a future version. Use a DatetimeIndex with BDay freq instead.
  'start': x['start'] + i,
command line arguments:
Namespace(seed=0, context_length=256, n_layer=4, n_embd=256, n_head=4, aug_prob=0.5, aug_rate=0.1, batch_size=100, num_batches_per_epoch=100, limit_val_batches=10, max_epochs=1000, test=False, early_stopping_patience=50)
no lightning logs found. Training from scratch.
num_parameters :  3434755
Traceback (most recent call last):
  File "/mnt/data/home/yxl/test/pytorch-transformer-ts/lag-llama/train.py", line 307, in <module>
    train(args)
  File "/mnt/data/home/yxl/test/pytorch-transformer-ts/lag-llama/train.py", line 233, in train
    train_output = estimator.train_model(
  File "/mnt/data/home/yxl/conda/envs/lagllama_vanilla/lib/python3.10/site-packages/gluonts/torch/model/estimator.py", line 201, in train_model
    trainer = pl.Trainer(
  File "/mnt/data/home/yxl/conda/envs/lagllama_vanilla/lib/python3.10/site-packages/lightning/pytorch/utilities/argparse.py", line 70, in insert_env_defaults
    return fn(self, **kwargs)
  File "/mnt/data/home/yxl/conda/envs/lagllama_vanilla/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 431, in __init__
    self._callback_connector.on_trainer_init(
  File "/mnt/data/home/yxl/conda/envs/lagllama_vanilla/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/callback_connector.py", line 79, in on_trainer_init
    _validate_callbacks_list(self.trainer.callbacks)
  File "/mnt/data/home/yxl/conda/envs/lagllama_vanilla/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/callback_connector.py", line 227, in _validate_callbacks_list
    stateful_callbacks = [cb for cb in callbacks if is_overridden("state_dict", instance=cb)]
  File "/mnt/data/home/yxl/conda/envs/lagllama_vanilla/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/callback_connector.py", line 227, in <listcomp>
    stateful_callbacks = [cb for cb in callbacks if is_overridden("state_dict", instance=cb)]
  File "/mnt/data/home/yxl/conda/envs/lagllama_vanilla/lib/python3.10/site-packages/lightning/pytorch/utilities/model_helpers.py", line 38, in is_overridden
    _check_mixed_imports(instance)
  File "/mnt/data/home/yxl/conda/envs/lagllama_vanilla/lib/python3.10/site-packages/lightning/pytorch/utilities/model_helpers.py", line 71, in _check_mixed_imports
    raise TypeError(
TypeError: You passed a `pytorch_lightning` object (EarlyStopping) to a `lightning.pytorch` Trainer. Please switch to a single import style.
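
For readers trying to understand the message above, here is a rough sketch of the kind of check that produces it. This is not Lightning's actual implementation; the names namespace_of, check_single_import_style, and FakeEarlyStopping are made up for illustration, and the stand-in class only pretends to live in the pytorch_lightning namespace so the snippet runs without either package installed.

```python
# Simplified, hypothetical sketch of a mixed-import check; NOT Lightning's actual code.
OLD_NAMESPACE = "pytorch_lightning"
NEW_NAMESPACE = "lightning.pytorch"


def namespace_of(obj):
    """Return the Lightning namespace the object's class was defined in, or None."""
    module = type(obj).__module__
    for namespace in (OLD_NAMESPACE, NEW_NAMESPACE):
        if module == namespace or module.startswith(namespace + "."):
            return namespace
    return None


def check_single_import_style(trainer_namespace, callbacks):
    """Raise TypeError if a callback comes from the other Lightning namespace."""
    for cb in callbacks:
        found = namespace_of(cb)
        if found is not None and found != trainer_namespace:
            raise TypeError(
                f"You passed a `{found}` object ({type(cb).__name__}) "
                f"to a `{trainer_namespace}` Trainer."
            )


# Stand-in class that pretends to be defined under the old namespace (illustration only).
FakeEarlyStopping = type(
    "EarlyStopping", (), {"__module__": "pytorch_lightning.callbacks.early_stopping"}
)

if __name__ == "__main__":
    try:
        check_single_import_style(NEW_NAMESPACE, [FakeEarlyStopping()])
    except TypeError as err:
        print(err)
```

In the traceback, the Trainer is built inside gluonts and resolves to lightning.pytorch, while the EarlyStopping callback is a pytorch_lightning object, so the two sides disagree in the same way as the stand-in above.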

ashok-arjun commented 10 months ago

Hey @hohe12ly, I'll look into this soon and get back!

ashok-arjun commented 10 months ago

Please see the other issue #29 for the requirements I use; please let me know if the error persists.

hohe12ly commented 9 months ago

I followed your instructions and installed lightning 2.0.4. I no longer get the reported error, so I will stay with lightning 2.0.4. Thanks for your help!
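
In case it helps anyone else who lands here, a small sketch for checking which of the relevant packages ended up in the environment after reinstalling. It uses only the standard library (importlib.metadata, Python 3.8+). The helper names installed_version, lightning_report, and both_lightning_flavours_installed are my own, and the distribution names are taken from the package list earlier in this thread.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def lightning_report(names=("lightning", "pytorch-lightning", "gluonts")):
    """Map each distribution name to its installed version (None if not installed)."""
    return {name: installed_version(name) for name in names}


def both_lightning_flavours_installed(report):
    """True when both the `lightning` and `pytorch-lightning` distributions are present."""
    return (
        report.get("lightning") is not None
        and report.get("pytorch-lightning") is not None
    )


if __name__ == "__main__":
    report = lightning_report()
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
    if both_lightning_flavours_installed(report):
        print("Both lightning and pytorch-lightning are installed; "
              "check that the code sticks to a single import style.")
```

Having both distributions installed is not necessarily a problem by itself; the reported error only showed up when objects from the two import styles were mixed in one Trainer.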