unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

AttributeError: 'AdamW' object has no attribute 'train' #1069

Open webbigdata-jp opened 1 month ago

webbigdata-jp commented 1 month ago

This is not a bug report, just a heads-up.

unsloth was not the cause: the error in the title occurred at the start of training, and accelerate appeared to be the culprit.

https://github.com/huggingface/transformers/issues/33620

pip install accelerate==0.34.2
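For anyone pinning dependencies, a quick sanity check before training can confirm the installed accelerate meets the minimum. This is a minimal sketch; `version_tuple`, `is_at_least`, and `accelerate_is_patched` are hypothetical helpers (not part of accelerate's API), and the naive tuple comparison ignores pre-release suffixes:

```python
from importlib.metadata import PackageNotFoundError, version

def version_tuple(v: str) -> tuple:
    # "0.34.2" -> (0, 34, 2); non-numeric pre-release parts are dropped
    return tuple(int(part) for part in v.split(".") if part.isdigit())

def is_at_least(installed: str, required: str = "0.34.2") -> bool:
    # Naive component-wise comparison of dotted version strings
    return version_tuple(installed) >= version_tuple(required)

def accelerate_is_patched() -> bool:
    """True if the installed accelerate is new enough to avoid the
    'AdamW' object has no attribute 'train' error."""
    try:
        return is_at_least(version("accelerate"))
    except PackageNotFoundError:
        return False
```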
webbigdata-jp commented 1 month ago

I have been able to train without any problems using the following version. Thank you.

$ pip list
Package                  Version
------------------------ ------------
accelerate               0.34.2
aiohttp                  3.9.5
aiosignal                1.3.1
async-timeout            4.0.3
attrs                    23.2.0
bitsandbytes             0.43.1
certifi                  2024.7.4
charset-normalizer       3.3.2
click                    8.1.7
datasets                 2.20.0
dill                     0.3.8
docker-pycreds           0.4.0
docstring_parser         0.16
einops                   0.8.0
filelock                 3.15.4
flash-attn               2.6.3
frozenlist               1.4.1
fsspec                   2024.5.0
gitdb                    4.0.11
GitPython                3.1.43
hf_transfer              0.1.8
huggingface-hub          0.23.4
idna                     3.7
Jinja2                   3.1.4
markdown-it-py           3.0.0
MarkupSafe               2.1.5
mdurl                    0.1.2
mpmath                   1.3.0
multidict                6.0.5
multiprocess             0.70.16
networkx                 3.3
ninja                    1.11.1.1
numpy                    1.26.4
nvidia-cublas-cu12       12.1.3.1
nvidia-cuda-cupti-cu12   12.1.105
nvidia-cuda-nvrtc-cu12   12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12        8.9.2.26
nvidia-cufft-cu12        11.0.2.54
nvidia-curand-cu12       10.3.2.106
nvidia-cusolver-cu12     11.4.5.107
nvidia-cusparse-cu12     12.1.0.106
nvidia-nccl-cu12         2.20.5
nvidia-nvjitlink-cu12    12.5.82
nvidia-nvtx-cu12         12.1.105
packaging                24.1
pandas                   2.2.2
peft                     0.11.1
pillow                   10.4.0
pip                      22.0.2
platformdirs             4.2.2
protobuf                 3.20.3
psutil                   6.0.0
pyarrow                  16.1.0
pyarrow-hotfix           0.6
Pygments                 2.18.0
python-dateutil          2.9.0.post0
pytz                     2024.1
PyYAML                   6.0.1
regex                    2024.5.15
requests                 2.32.3
rich                     13.7.1
safetensors              0.4.3
sentencepiece            0.2.0
sentry-sdk               2.7.1
setproctitle             1.3.3
setuptools               59.6.0
shtab                    1.7.1
six                      1.16.0
smmap                    5.0.1
sympy                    1.13.0
tokenizers               0.20.0
torch                    2.3.0
tqdm                     4.66.4
transformers             4.45.1
triton                   2.3.0
trl                      0.9.6
typing_extensions        4.12.2
tyro                     0.8.5
tzdata                   2024.1
unsloth                  2024.9.post3
urllib3                  2.2.2
wandb                    0.17.4
wheel                    0.43.0
xformers                 0.0.26.post1
xxhash                   3.4.1
yarl                     1.9.4
Kaushalya commented 1 month ago

Upgrading accelerate to 0.34.2 solved the issue.

danielhanchen commented 1 month ago

Sorry on the delay - I added accelerate>=0.34.2 in pyproject.toml so future installs won't have this issue - thanks for the fixes everyone!

emuchogu commented 1 month ago

Updating unsloth_env_file.yml with: accelerate==0.34.2

in https://github.com/unslothai/unsloth/wiki#nvidia-pascal-support

also works and allows training on NVIDIA Pascal cards (P40, P100).

taaaibu commented 1 month ago

This may be off-topic since I don't use unsloth, but I hit the same issue. In my trainer.py, the code

model.train()
if hasattr(self.optimizer, "train") and callable(self.optimizer.train):
    self.optimizer.train()

caused the issue. I changed it to:

model.train()
if hasattr(self.optimizer, 'train'):
    self.model.train()

and training started without an error. Maybe that helps someone. Updating accelerate is probably the better fix, but it didn't work in my case. (I used Oobabooga with the Training PRO extension.)
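For anyone who cannot upgrade accelerate, a try/except guard tends to be more robust than an hasattr check here: the affected accelerate versions *did* expose a `train()` method on their optimizer wrapper, which merely delegated to the wrapped optimizer and raised there. A minimal self-contained sketch of the failure mode and the guard (all class names are hypothetical stand-ins, not the real torch/accelerate classes):

```python
class PlainAdamW:
    """Stand-in for torch.optim.AdamW, which defines no train()/eval()."""

class ScheduleFreeOpt:
    """Stand-in for a schedule-free optimizer that does define train()."""
    def train(self):
        self.mode = "train"

class AcceleratedWrapper:
    """Stand-in for an optimizer wrapper that delegates unconditionally:
    hasattr(wrapper, 'train') is True, yet calling it can still raise."""
    def __init__(self, opt):
        self.optimizer = opt
    def train(self):
        self.optimizer.train()  # raises AttributeError for PlainAdamW

def maybe_train(optimizer) -> bool:
    """Switch the optimizer to train mode if it actually supports it.
    Returns True if train() ran, False if it was missing at any level."""
    try:
        optimizer.train()
        return True
    except AttributeError:
        return False
```

The hasattr check in trainer.py passes for the wrapper, so the guard has to catch the AttributeError raised one level down instead.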