ErikZ719 opened this issue 6 days ago
Hi @ErikZ719
Thanks for your feedback.
As the official LLaVA repo states, if you do not have enough GPU memory for LLaVA training, please consider:
1) Use LoRA: finetune_lora.sh. As LLaVA indicates, 7B training fits on 8x RTX 3090 (I ran LoRA training on a 4090 and it seemed to work well). Make sure per_device_train_batch_size * gradient_accumulation_steps is the same as in the provided script for best reproducibility.
2) Replace zero3.json with zero3_offload.json, which offloads some parameters to CPU RAM. This slows down training.
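For tip 2, the key difference between the two configs is the offload section. A minimal sketch of what zero3_offload.json adds on top of zero3.json (field names follow the DeepSpeed ZeRO-3 schema; the actual file in the repo may set more options):

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true }
  }
}
```

Offloading optimizer states and parameters to CPU RAM trades GPU memory for PCIe transfer time, which is why training gets slower.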
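For tip 1, the constraint on per_device_train_batch_size * gradient_accumulation_steps can be checked with a small sketch. It assumes the LLaVA finetune default of a 128 global batch size (16 per device x 8 GPUs x 1 accumulation step); the helper name is illustrative, not part of the repo:

```python
def grad_accum_steps(target_global, per_device, num_gpus):
    """Gradient accumulation steps needed to keep the global batch size
    fixed: target_global = per_device * num_gpus * grad_accum."""
    per_step = per_device * num_gpus
    if target_global % per_step != 0:
        raise ValueError("pick a per-device batch that divides the global batch")
    return target_global // per_step

# LLaVA finetune default: global 128 = 16 per device x 8 GPUs x 1 step.
print(grad_accum_steps(128, 16, 8))  # 1
# Dropping to per-device batch 4 on 4 GPUs needs 8 accumulation steps.
print(grad_accum_steps(128, 4, 4))   # 8
```

So when you shrink per_device_train_batch_size to fit memory, raise gradient_accumulation_steps by the same factor and the effective batch size (and therefore reproducibility) stays unchanged.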
Thank you very much for your reply. Would it be possible to share your CUDA version and `pip list` output on the 4090? I'm getting a version-conflict error in my LoRA training.
Hi @ErikZ719
Please check below for your reference.
pip list
Package Version Editable project location
----------------------------- ----------- ---------------------------------------
accelerate 0.26.1
aiofiles 23.2.1
aiohappyeyeballs 2.4.3
aiohttp 3.10.8
aiosignal 1.3.1
altair 5.2.0
annotated-types 0.6.0
antlr4-python3-runtime 4.9.3
anyio 4.6.0
asttokens 2.4.1
async-timeout 4.0.3
attrs 24.2.0
av 13.0.0
bitsandbytes 0.44.1
black 24.1.0
bleach 6.1.0
blis 0.7.11
braceexpand 0.1.7
Brotli 1.0.9
cachetools 5.3.3
catalogue 2.0.10
certifi 2024.8.30
cffi 1.16.0
cfgv 3.4.0
chardet 5.2.0
charset-normalizer 3.3.2
click 8.1.7
cloudpathlib 0.16.0
colorama 0.4.6
confection 0.1.4
contexttimer 0.3.3
contourpy 1.3.0
cycler 0.12.1
cymem 2.0.8
DataProperty 1.0.1
datasets 2.16.1
decorator 4.4.2
decord 0.6.0
deepspeed 0.13.1
diffusers 0.16.0
dill 0.3.7
distlib 0.3.8
distro 1.9.0
docker-pycreds 0.4.0
easydict 1.9
einops 0.6.1
einops-exts 0.0.4
et-xmlfile 1.1.0
evaluate 0.4.3
exceptiongroup 1.2.0
executing 2.0.1
fairscale 0.4.4
fastapi 0.115.0
ffmpy 0.4.0
filelock 3.13.1
fonttools 4.54.1
frozenlist 1.4.1
fsspec 2023.10.0
ftfy 6.1.3
gitdb 4.0.11
GitPython 3.1.43
gmpy2 2.1.2
gradio 4.16.0
gradio_client 0.8.1
h11 0.14.0
h5py 3.10.0
hf_transfer 0.1.8
hjson 3.1.0
httpcore 0.16.3
httpx 0.23.3
huggingface-hub 0.25.1
identify 2.5.35
idna 3.7
imageio-ffmpeg 0.4.9
importlib_resources 6.4.5
iopath 0.1.10
ipython 8.22.1
isort 5.13.2
jedi 0.19.1
Jinja2 3.1.4
jiter 0.5.0
joblib 1.3.2
jsonlines 4.0.0
jsonschema 4.23.0
jsonschema-specifications 2023.12.1
kaggle 1.6.6
kiwisolver 1.4.7
langcodes 3.3.0
latex2mathml 3.77.0
lazy_loader 0.3
llava 1.2.2.post1 /home/xingyun/xingy/cca-llava
lmms_eval 0.2.4 /home/xingyun/xingy/cca-llava/lmms-eval
loguru 0.7.2
lxml 5.3.0
markdown-it-py 3.0.0
markdown2 2.5.0
MarkupSafe 2.1.3
matplotlib 3.9.2
matplotlib-inline 0.1.6
mbstrdecoder 1.1.3
mdurl 0.1.2
mkl_fft 1.3.10
mkl_random 1.2.7
mkl-service 2.4.0
moviepy 1.0.3
mpmath 1.3.0
multidict 6.1.0
multiprocess 0.70.15
murmurhash 1.0.10
mutagen 1.47.0
mypy-extensions 1.0.0
networkx 3.2.1
ninja 1.11.1.1
nltk 3.8.1
nodeenv 1.8.0
numexpr 2.10.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nvjitlink-cu12 12.3.101
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
openai 1.51.0
opencv-python-headless 4.10.0.84
opendatasets 0.1.22
openpyxl 3.1.5
orjson 3.10.7
packaging 24.1
pandas 2.2.3
parso 0.8.3
pathspec 0.12.1
pathvalidate 3.2.1
peft 0.13.0
pillow 10.4.0
pip 24.2
platformdirs 4.2.0
portalocker 2.8.2
pre-commit 3.6.2
preshed 3.0.9
proglog 0.1.10
prompt-toolkit 3.0.43
protobuf 3.20.0
psutil 6.0.0
pure-eval 0.2.2
py-cpuinfo 9.0.0
pyarrow 15.0.0
pyarrow-hotfix 0.6
pybind11 2.13.6
pycocoevalcap 1.2
pycocotools 2.0.8
pycparser 2.21
pycryptodomex 3.21.0
pydantic 2.9.2
pydantic_core 2.23.4
pydeck 0.8.1b0
pydub 0.25.1
Pygments 2.17.2
pynvml 11.5.0
pyparsing 3.1.4
PySocks 1.7.1
pytablewriter 1.2.0
python-dateutil 2.9.0.post0
python-magic 0.4.27
python-multipart 0.0.12
python-slugify 8.0.4
pytz 2024.2
PyYAML 6.0.1
pyyaml_env_tag 0.1
referencing 0.35.1
regex 2024.9.11
requests 2.32.3
rfc3986 1.5.0
rich 13.9.1
rpds-py 0.20.0
ruff 0.6.8
sacrebleu 2.4.3
safetensors 0.4.5
scikit-image 0.22.0
scikit-learn 1.2.2
scipy 1.14.1
seaborn 0.13.2
semantic-version 2.10.0
sentencepiece 0.1.99
sentry-sdk 2.14.0
setproctitle 1.3.3
setuptools 75.1.0
shellingham 1.5.4
shortuuid 1.0.13
six 1.16.0
smart-open 6.4.0
smmap 5.0.1
sniffio 1.3.1
soundfile 0.12.1
spacy-legacy 3.0.12
spacy-loggers 1.0.5
sqlitedict 2.1.0
srsly 2.4.8
stack-data 0.6.3
starlette 0.38.6
streamlit 1.31.1
svgwrite 1.4.3
sympy 1.12
tabledata 1.3.3
tabulate 0.9.0
tcolorpy 0.1.6
tenacity 8.3.0
tensorboardX 2.6.2.2
text-unidecode 1.3
threadpoolctl 3.5.0
tifffile 2024.2.12
tiktoken 0.7.0
timm 0.6.13
tokenizers 0.15.2
toml 0.10.2
tomli 2.0.2
tomlkit 0.12.0
toolz 0.12.1
torch 2.1.1
torchvision 0.16.1
tornado 6.4
tqdm 4.66.5
tqdm-multiprocess 0.0.11
traitlets 5.14.1
transformers 4.37.2
transformers-stream-generator 0.0.5
triton 2.1.0
typepy 1.3.2
typer 0.12.5
typing_extensions 4.11.0
tzdata 2024.2
tzlocal 5.2
urllib3 2.2.3
uvicorn 0.31.0
validators 0.22.0
virtualenv 20.25.1
wandb 0.18.2
wasabi 1.1.2
watchdog 4.0.0
wavedrom 2.0.3.post3
wcwidth 0.2.13
weasel 0.3.4
webencodings 0.5.1
websockets 13.1
wheel 0.44.0
xformers 0.0.23
xxhash 3.5.0
yarl 1.13.1
yt-dlp 2024.9.27
zss 1.2.0
zstandard 0.23.0
CUDA version:

```python
import torch
print(torch.version.cuda)  # 12.1
```
What can I say! Man, thank you very much. :)
Hi, I followed the default settings (pyproject.toml) for the fine-tuning experiments on 3x RTX 3090 and it reports out of memory; is this normal? Pre-training works fine.

```
Package                       Version     Editable project location
absl-py                       2.1.0
accelerate                    0.26.1
aiofiles                      23.2.1
altair                        5.4.1
annotated-types               0.7.0
anyio                         4.6.2.post1
attrs                         24.2.0
bitsandbytes                  0.44.1
certifi                       2022.12.7
charset-normalizer            2.1.1
click                         8.1.7
contourpy                     1.3.0
cycler                        0.12.1
deepspeed                     0.13.1
docker-pycreds                0.4.0
einops                        0.6.1
einops-exts                   0.0.4
exceptiongroup                1.2.2
fastapi                       0.115.4
ffmpy                         0.4.0
filelock                      3.13.1
flash-attn                    2.5.8
fonttools                     4.54.1
fsspec                        2024.2.0
gitdb                         4.0.11
GitPython                     3.1.43
gradio                        4.16.0
gradio_client                 0.8.1
grpcio                        1.67.1
h11                           0.14.0
hjson                         3.1.0
httpcore                      0.17.3
httpx                         0.24.0
huggingface-hub               0.26.2
idna                          3.4
importlib_resources           6.4.5
Jinja2                        3.1.3
joblib                        1.4.2
jsonschema                    4.23.0
jsonschema-specifications     2024.10.1
kiwisolver                    1.4.7
latex2mathml                  3.77.0
llava                         1.2.2.post1 /root/zqy/cca-llava
Markdown                      3.7
markdown-it-py                3.0.0
markdown2                     2.5.1
MarkupSafe                    2.1.5
matplotlib                    3.9.2
mdurl                         0.1.2
mpmath                        1.3.0
narwhals                      1.12.1
networkx                      3.2.1
ninja                         1.11.1.1
numpy                         1.26.3
orjson                        3.10.11
packaging                     24.1
pandas                        2.2.3
peft                          0.13.2
pillow                        10.2.0
pip                           24.3.1
platformdirs                  4.3.6
protobuf                      5.28.3
psutil                        6.1.0
py-cpuinfo                    9.0.0
pydantic                      2.9.2
pydantic_core                 2.23.4
pydub                         0.25.1
Pygments                      2.18.0
pynvml                        11.5.0
pyparsing                     3.2.0
python-dateutil               2.9.0.post0
python-multipart              0.0.17
pytz                          2024.2
PyYAML                        6.0.2
referencing                   0.35.1
regex                         2024.9.11
requests                      2.28.1
rich                          13.9.4
rpds-py                       0.20.1
ruff                          0.7.2
safetensors                   0.4.5
scikit-learn                  1.2.2
scipy                         1.14.1
semantic-version              2.10.0
sentencepiece                 0.1.99
sentry-sdk                    2.17.0
setproctitle                  1.3.3
setuptools                    75.1.0
shellingham                   1.5.4
shortuuid                     1.0.13
six                           1.16.0
smmap                         5.0.1
sniffio                       1.3.1
starlette                     0.41.2
svgwrite                      1.4.3
sympy                         1.13.1
tensorboard                   2.18.0
tensorboard-data-server       0.7.2
threadpoolctl                 3.5.0
timm                          0.6.13
tokenizers                    0.15.2
tomlkit                       0.12.0
torch                         2.1.1+cu121
torchaudio                    2.1.1+cu121
torchvision                   0.16.1+cu121
tqdm                          4.66.6
transformers                  4.37.2
triton                        2.1.0
typer                         0.12.5
typing_extensions             4.12.2
tzdata                        2024.2
urllib3                       1.26.13
uvicorn                       0.32.0
wandb                         0.18.5
wavedrom                      2.0.3.post3
websockets                    11.0.3
Werkzeug                      3.1.1
wheel                         0.44.0
xformers                      0.0.23
```
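Since two environments are listed in this thread, a quick way to hunt for version conflicts is to diff the two `pip list` outputs. A stdlib-only sketch (the package subset below is illustrative, with versions taken from the two lists above):

```python
def parse_pip_list(text):
    """Parse 'name version' lines (pip list format) into a dict."""
    env = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) >= 2:
            env[parts[0]] = parts[1]
    return env

def diff_envs(a, b):
    """Return packages present in both environments whose versions differ."""
    return {pkg: (a[pkg], b[pkg])
            for pkg in sorted(a.keys() & b.keys())
            if a[pkg] != b[pkg]}

env_4090 = parse_pip_list("""
accelerate 0.26.1
deepspeed 0.13.1
transformers 4.37.2
peft 0.13.0
""")
env_3090 = parse_pip_list("""
accelerate 0.26.1
deepspeed 0.13.1
transformers 4.37.2
peft 0.13.2
""")
print(diff_envs(env_4090, env_3090))  # {'peft': ('0.13.0', '0.13.2')}
```

Feeding the full lists through this shows exactly which pins differ between the working 4090 setup and the 3x3090 one.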