Shivanandroy / simpleT5

simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models.
MIT License

AttributeError: 'LightningDataModule' object has no attribute '_has_setup_TrainerFn.FITTING' #57

Closed tr4wzified closed 1 year ago

tr4wzified commented 1 year ago

Hello, I'm new to this whole fine-tuning thing, but I'd like to give it a try for another (C#) project I'm working on and hopefully integrate it there. I'm following this tutorial (great write-up by the way!) but I'm having some issues trying to train my model. Here's the code I've got:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from simplet5 import SimpleT5

import os

df = pd.read_csv("Khajiitifier/bin/Debug/net6.0/Translations.csv", encoding='latin-1', sep='|')
df = df[['source_text', 'target_text']]
df['source_text'] = "khajiit: " + df['source_text']
train_df, test_df = train_test_split(df, test_size=0.3)
#print(train_df.shape, test_df.shape)
model = SimpleT5()
model.from_pretrained(model_type="t5", model_name="t5-base")
model.train(train_df=train_df[:5000],
            eval_df=test_df[:100],
            source_max_token_len=128,
            target_max_token_len=50,
            batch_size=8, max_epochs=5, use_gpu=False) # use_gpu=False because my 7900XTX is not recognized

When I start training the model, I get the following error: AttributeError: 'LightningDataModule' object has no attribute '_has_setup_TrainerFn.FITTING'. Full stack trace:

Global seed set to 42
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Traceback (most recent call last):
  File "C:\Users\tik\source\repos\Khajiitifier\simplet5sample.py", line 15, in <module>
    model.train(train_df=train_df[:5000],
  File "C:\Users\tik\AppData\Local\Programs\Python\Python311\Lib\site-packages\simplet5\simplet5.py", line 395, in train
    trainer.fit(self.T5Model, self.data_module)
  File "C:\Users\tik\AppData\Roaming\Python\Python311\site-packages\pytorch_lightning\trainer\trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "C:\Users\tik\AppData\Roaming\Python\Python311\site-packages\pytorch_lightning\trainer\trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\tik\AppData\Roaming\Python\Python311\site-packages\pytorch_lightning\trainer\trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "C:\Users\tik\AppData\Roaming\Python\Python311\site-packages\pytorch_lightning\trainer\trainer.py", line 1138, in _run
    self._call_setup_hook()  # allow user to setup lightning_module in accelerator environment
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\tik\AppData\Roaming\Python\Python311\site-packages\pytorch_lightning\trainer\trainer.py", line 1438, in _call_setup_hook
    self.datamodule.setup(stage=fn)
  File "C:\Users\tik\AppData\Roaming\Python\Python311\site-packages\pytorch_lightning\core\datamodule.py", line 461, in wrapped_fn
    has_run = getattr(obj, attr)
              ^^^^^^^^^^^^^^^^^^
AttributeError: 'LightningDataModule' object has no attribute '_has_setup_TrainerFn.FITTING'

I'm wondering if I've got some dependency issue, or am I doing something else wrong? I'm using Python 3.11.4 on Windows 11 64-bit.

tr4wzified commented 1 year ago

I managed to get it to work by using Pop!_OS (Linux) instead of Windows. I'm not sure what caused this error.

alexanderfroeber commented 2 months ago

Maybe someone is still interested in this issue...

I use pytorch-lightning == 1.5.10. This works with Python 3.10.x but not with 3.11.x.

The problem is that Python 3.11 changed how Enums are formatted. PyTorch-Lightning does the following:

name = "setup"
stage = TrainerFn.FITTING
attr = f"_has_{name}_{stage}"
has_run = getattr(obj, attr)

where TrainerFn is an Enum:

class TrainerFn(LightningEnum):
    FITTING = "fit"
    VALIDATING = "validate"
    ...

In Python 3.10.x this results in '_has_setup_fit'. In Python >=3.11.0 it results in '_has_setup_TrainerFn.FITTING', an attribute name the data module does not have, hence the AttributeError.
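
The formatting difference can be reproduced without PyTorch-Lightning installed. The snippet below is only a minimal sketch: its TrainerFn is a hypothetical stand-in that assumes LightningEnum behaves like a plain str-mixin Enum, and the two helper functions are made up for illustration. Neither is Lightning's actual code.

```python
import sys
from enum import Enum


class TrainerFn(str, Enum):
    """Hypothetical stand-in for Lightning's TrainerFn (assumed str-mixin Enum)."""

    FITTING = "fit"
    VALIDATING = "validate"


def attr_from_fstring(name: str, stage: TrainerFn) -> str:
    # Interpolates the Enum member itself, so the result depends on how the
    # running Python version formats mixed-in Enum members.
    return f"_has_{name}_{stage}"


def attr_from_value(name: str, stage: TrainerFn) -> str:
    # Interpolates the member's underlying string value, which does not
    # depend on Enum formatting rules.
    return f"_has_{name}_{stage.value}"


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:2])))
    print("member in f-string:", attr_from_fstring("setup", TrainerFn.FITTING))
    print(".value in f-string:", attr_from_value("setup", TrainerFn.FITTING))
```

Running this under 3.10 and then under 3.11 should show the first attribute name changing while the second stays the same, which matches the behaviour described above.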