hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
https://arxiv.org/abs/2403.13372
Apache License 2.0

GLM4 SFT: train loss becomes 0 from around epoch 0.45 onward #4084

Closed · maiqingqiang closed this issue 4 months ago

maiqingqiang commented 4 months ago

Reminder

System Info

Package  Version  Editable project location

accelerate  0.30.1
aiofiles  23.2.1
aiohttp  3.9.5
aiosignal  1.3.1
altair  5.3.0
annotated-types  0.6.0
anyio  4.3.0
async-timeout  4.0.3
attrs  23.2.0
auto_gptq  0.7.1
bitsandbytes  0.43.1
blinker  1.8.2
cachetools  5.3.3
certifi  2024.2.2
charset-normalizer  3.3.2
click  8.1.7
cloudpickle  3.0.0
cmake  3.29.3
coloredlogs  15.0.1
contourpy  1.2.1
cycler  0.12.1
dataclasses-json  0.6.6
datasets  2.19.2
deepdiff  7.0.1
deepspeed  0.14.0
dill  0.3.7
diskcache  5.6.3
distro  1.9.0
dnspython  2.6.1
docstring_parser  0.16
einops  0.8.0
email_validator  2.1.1
exceptiongroup  1.2.1
fastapi  0.111.0
fastapi-cli  0.0.3
ffmpy  0.3.2
filelock  3.14.0
fire  0.6.0
fonttools  4.51.0
frozenlist  1.4.1
fsspec  2024.3.1
gekko  1.1.1
gitdb  4.0.11
GitPython  3.1.43
gradio  4.31.3
gradio_client  0.16.3
greenlet  3.0.3
h11  0.14.0
hjson  3.1.0
httpcore  1.0.5
httptools  0.6.1
httpx  0.27.0
huggingface-hub  0.23.0
humanfriendly  10.0
idna  3.7
importlib_resources  6.4.0
interegular  0.3.3
jieba  0.42.1
Jinja2  3.1.4
joblib  1.4.2
jsonpatch  1.33
jsonpointer  2.4
jsonschema  4.22.0
jsonschema-specifications  2023.12.1
kiwisolver  1.4.5
langchain  0.1.20
langchain-community  0.0.38
langchain-core  0.1.52
langchain-text-splitters  0.0.2
langsmith  0.1.59
lark  1.1.9
llamafactory  0.7.2.dev0  /root/LLaMA-Factory
llmtuner  0.7.2.dev0  /root/LLaMA-Factory
llvmlite  0.42.0
lm-format-enforcer  0.10.1
markdown-it-py  3.0.0
MarkupSafe  2.1.5
marshmallow  3.21.2
matplotlib  3.9.0
mdurl  0.1.2
mpmath  1.3.0
msgpack  1.0.8
multidict  6.0.5
multiprocess  0.70.15
mypy-extensions  1.0.0
nest-asyncio  1.6.0
networkx  3.3
ninja  1.11.1.1
nltk  3.8.1
numba  0.59.1
numpy  1.26.4
nvidia-cublas-cu12  12.1.3.1
nvidia-cuda-cupti-cu12  12.1.105
nvidia-cuda-nvrtc-cu12  12.1.105
nvidia-cuda-runtime-cu12  12.1.105
nvidia-cudnn-cu12  8.9.2.26
nvidia-cufft-cu12  11.0.2.54
nvidia-curand-cu12  10.3.2.106
nvidia-cusolver-cu12  11.4.5.107
nvidia-cusparse-cu12  12.1.0.106
nvidia-ml-py  12.550.52
nvidia-nccl-cu12  2.20.5
nvidia-nvjitlink-cu12  12.4.127
nvidia-nvtx-cu12  12.1.105
openai  1.30.1
optimum  1.20.0
ordered-set  4.1.0
orjson  3.10.3
outlines  0.0.34
packaging  23.2
pandas  2.2.2
peft  0.11.1
pillow  10.3.0
pip  24.0
prometheus_client  0.20.0
prometheus-fastapi-instrumentator  7.0.0
protobuf  4.25.3
psutil  5.9.8
py-cpuinfo  9.0.0
pyarrow  16.1.0
pyarrow-hotfix  0.6
pydantic  2.7.1
pydantic_core  2.18.2
pydeck  0.9.1
pydub  0.25.1
Pygments  2.18.0
pynvml  11.5.0
pyparsing  3.1.2
python-dateutil  2.9.0.post0
python-dotenv  1.0.1
python-multipart  0.0.9
pytz  2024.1
PyYAML  6.0.1
ray  2.22.0
referencing  0.35.1
regex  2024.5.15
requests  2.32.3
rich  13.7.1
rouge  1.0.1
rouge-chinese  1.0.3
rpds-py  0.18.1
ruff  0.4.4
safetensors  0.4.3
schedule  1.2.1
scipy  1.13.0
semantic-version  2.10.0
sentencepiece  0.2.0
setuptools  69.5.1
shellingham  1.5.4
shtab  1.7.1
six  1.16.0
smmap  5.0.1
sniffio  1.3.1
socksio  1.0.0
SQLAlchemy  2.0.30
sse-starlette  2.1.0
starlette  0.37.2
streamlit  1.34.0
sympy  1.12
tenacity  8.3.0
termcolor  2.4.0
tiktoken  0.6.0
tokenizers  0.19.1
toml  0.10.2
tomlkit  0.12.0
toolz  0.12.1
torch  2.3.0
tornado  6.4
tqdm  4.66.4
transformers  4.41.2
transformers-stream-generator  0.0.5
triton  2.3.0
trl  0.8.6
typer  0.12.3
typing_extensions  4.11.0
typing-inspect  0.9.0
tyro  0.8.4
tzdata  2024.1
ujson  5.10.0
urllib3  2.2.1
uvicorn  0.29.0
uvloop  0.19.0
vllm  0.4.3
vllm-flash-attn  2.5.8.post2
vllm_nccl_cu12  2.18.1.0.4.0
watchdog  4.0.0
watchfiles  0.21.0
websockets  11.0.3
wheel  0.43.0
xformers  0.0.26.post1
xxhash  3.4.1
yarl  1.9.4

Reproduction

llamafactory-cli train \
    --stage sft \
    --do_train True \
    --model_name_or_path THUDM/glm-4-9b-chat \
    --finetuning_type lora \
    --template glm4 \
    --flash_attn auto \
    --use_unsloth False \
    --dataset_dir data \
    --dataset xxxx \
    --cutoff_len 1024 \
    --learning_rate 3e-4 \
    --num_train_epochs 2 \
    --max_samples 100000 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/THUDM/glm-4-9b-chat/395/lora/train_20240605161602 \
    --fp16 True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all \
    --val_size 0.10 \
    --evaluation_strategy steps \
    --eval_steps 100 \
    --per_device_eval_batch_size 1 \
    --load_best_model_at_end True \
    --preprocessing_num_workers 32 \
    --plot_loss True \
    --overwrite_cache True \
    --ddp_timeout 180000000

Expected behavior

[Screenshot: iTerm2 training log, 2024-06-05 16:40:24]

Others

No response

hiyouga commented 4 months ago

The learning rate is too high.
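Based on that diagnosis, one way to retry is to keep the command from the Reproduction section but lower the LoRA learning rate and add a brief warmup. A minimal sketch only: the values 1e-4 and 100 warmup steps are illustrative assumptions, not numbers given in this thread, and the new output directory is a hypothetical placeholder; every flag not shown here can stay exactly as in the Reproduction command.

# Sketch of a retry with a smaller learning rate and warmup (illustrative values).
# Flags omitted here are assumed unchanged from the Reproduction command above.
llamafactory-cli train \
    --stage sft \
    --do_train True \
    --model_name_or_path THUDM/glm-4-9b-chat \
    --finetuning_type lora \
    --template glm4 \
    --dataset_dir data \
    --dataset xxxx \
    --cutoff_len 1024 \
    --learning_rate 1e-4 \
    --warmup_steps 100 \
    --num_train_epochs 2 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all \
    --fp16 True \
    --output_dir saves/THUDM/glm-4-9b-chat/395/lora/train_retry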