Closed LoFiApostasy closed 2 months ago
Thanks for the feedback. Can you install accelerate==0.21.0 and try again?
Thanks for the swift response! I got training working after some tinkering last night, with help from a "phone a friend" who iterated through it with me. I hope this helps others; please close this one out. The following has been repeatable so far on both pure Linux and WSL:
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install transformers==4.33.2 timm==0.4.12 sentencepiece==0.1.99 gradio==4.13.0 markdown2==2.4.10 xlsxwriter==3.1.2 einops
pip install deepspeed peft
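In case it helps anyone double-check their venv after running the commands above, here is a minimal sketch that compares the installed versions against those pins. The `EXPECTED` table and the `check_pins` helper are names made up for this illustration (they are not part of the InternLM-XComposer repo), and the check ignores local build tags such as `+cu117`.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip commands above; the "+cu117" local tag is
# stripped before comparing, so only the base version is listed here.
EXPECTED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "torchaudio": "0.13.1",
    "transformers": "4.33.2",
    "timm": "0.4.12",
    "sentencepiece": "0.1.99",
    "gradio": "4.13.0",
    "markdown2": "2.4.10",
    "xlsxwriter": "3.1.2",
}


def check_pins(expected, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except PackageNotFoundError:
            have = None  # package is not installed in this environment
        # Compare only the part before any local version tag ("+cu117").
        if have is None or have.split("+")[0] != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in check_pins(EXPECTED).items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

Run inside the training venv, it prints one line per package that differs from the pins and nothing if everything matches. deepspeed, peft, and einops are left out because the commands above do not pin them.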
Here are my current package versions:
$ pip3 list
Package Version
------------------------- ------------
accelerate 0.29.3
aiofiles 23.2.1
aiohttp 3.9.5
aiosignal 1.3.1
altair 5.3.0
annotated-types 0.6.0
anyio 4.3.0
async-timeout 4.0.3
attrs 23.2.0
auto_gptq 0.7.1
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
cmake 3.25.0
contourpy 1.2.1
cycler 0.12.1
datasets 2.19.0
deepspeed 0.14.1
dill 0.3.8
einops 0.7.0
exceptiongroup 1.2.1
fastapi 0.110.2
ffmpy 0.3.2
filelock 3.13.4
fonttools 4.51.0
frozenlist 1.4.1
fsspec 2024.3.1
gekko 1.1.1
gradio 4.13.0
gradio_client 0.8.0
h11 0.14.0
hjson 3.1.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.22.2
idna 3.7
importlib_resources 6.4.0
Jinja2 3.1.3
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lit 15.0.7
markdown-it-py 3.0.0
markdown2 2.4.10
MarkupSafe 2.1.5
matplotlib 3.8.4
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.2.1
ninja 1.11.1.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.1.105
orjson 3.10.1
packaging 24.0
pandas 2.2.2
peft 0.10.0
pillow 10.3.0
pip 24.0
psutil 5.9.8
py-cpuinfo 9.0.0
pyarrow 16.0.0
pyarrow-hotfix 0.6
pydantic 2.7.0
pydantic_core 2.18.1
pydub 0.25.1
Pygments 2.17.2
pynvml 11.5.0
pyparsing 3.1.2
python-dateutil 2.9.0.post0
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
referencing 0.34.0
regex 2024.4.16
requests 2.31.0
rich 13.7.1
rouge 1.0.1
rpds-py 0.18.0
safetensors 0.4.3
semantic-version 2.10.0
sentencepiece 0.1.99
setuptools 68.2.2
shellingham 1.5.4
six 1.16.0
sniffio 1.3.1
starlette 0.37.2
sympy 1.12
timm 0.4.12
tokenizers 0.13.3
tomlkit 0.12.0
toolz 0.12.1
torch 1.13.1+cu117
torchaudio 0.13.1+cu117
torchvision 0.14.1+cu117
tqdm 4.66.2
transformers 4.33.2
triton 2.2.0
typer 0.12.3
typing_extensions 4.11.0
tzdata 2024.1
urllib3 2.2.1
uvicorn 0.29.0
websockets 11.0.3
wheel 0.41.2
XlsxWriter 3.1.2
xxhash 3.4.1
yarl 1.9.4
zipp 3.18.1
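For anyone who wants to mirror this environment, the listing above can be turned into `name==version` lines suitable for a requirements file. This is only a rough sketch: `pip_list_to_requirements` is a hypothetical helper written for this post, and it assumes the two-column `pip list` layout shown above.

```python
def pip_list_to_requirements(text):
    """Convert two-column `pip list` output into `name==version` lines."""
    pins = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, the dashed
        # separator row, and anything that is not exactly two columns.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        pins.append(f"{parts[0]}=={parts[1]}")
    return pins


if __name__ == "__main__":
    sample = """Package Version
------------------------- ------------
accelerate 0.29.3
torch 1.13.1+cu117
transformers 4.33.2
"""
    print("\n".join(pip_list_to_requirements(sample)))
```

On a machine that already has the working venv, `pip freeze > requirements.txt` produces the same kind of file directly. Note that the `+cu117` builds still need the `--extra-index-url` from the install commands to resolve.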
Hi, I'm trying to kick off LoRA training using a fresh install following this page: https://github.com/InternLM/InternLM-XComposer/blob/main/docs/install.md I hope I'm just overlooking something simple. I've included my accelerate config. I saw this error show up elsewhere, and it feels like a package version issue. There is no requirements.txt that mirrors your training environment exactly; if someone could post a pip list from a working training venv, that might help too. Any advice would be welcome. Thanks!