Pillars-Creation / ChatGLM-RLHF-LoRA-RM-PPO

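Adds an RLHF implementation to ChatGLM-6B, with line-by-line explanations of some of the core code. The examples cover short news headline generation and an RLHF implementation for recommendation given a specified context.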
Apache License 2.0

[BUG/Help] Running `python finetune_ppo.py` with the fp16 argument enabled raises ValueError: Attempting to unscale FP16 gradients. #5

Closed mockyd closed 11 months ago

mockyd commented 11 months ago

Is there an existing issue for this?

Current Behavior

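The problem is as described in the title. I found the same issue reported in another project: https://github.com/ymcui/Chinese-LLaMA-Alpaca/issues/310. I suspect my peft version is the cause, so could you tell me which peft version you used?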
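The error message comes from PyTorch's gradient scaler, which refuses to unscale gradients that are themselves stored in FP16. One way to narrow this down is to list the trainable parameters that are held in float16. The sketch below assumes the model exposes `named_parameters()` like any `torch.nn.Module`; the function name is hypothetical and the demo uses stand-in objects instead of a real model.

```python
from types import SimpleNamespace


def fp16_trainable_params(named_parameters):
    """Return the names of trainable parameters whose dtype is float16.

    `named_parameters` is any iterable of (name, param) pairs where each
    param has `requires_grad` and `dtype` attributes, e.g. the result of
    `model.named_parameters()` on a torch module.
    """
    return [
        name
        for name, param in named_parameters
        # str(torch.float16) is "torch.float16", so no torch import is needed here
        if param.requires_grad and str(param.dtype) == "torch.float16"
    ]


# Stand-in parameters for illustration only (not taken from the real model).
demo_params = [
    ("base.weight", SimpleNamespace(requires_grad=False, dtype="torch.float16")),
    ("lora_A.weight", SimpleNamespace(requires_grad=True, dtype="torch.float16")),
    ("lora_B.weight", SimpleNamespace(requires_grad=True, dtype="torch.float32")),
]

suspects = fp16_trainable_params(demo_params)
print(suspects)
```

With a real model, the call would be `fp16_trainable_params(model.named_parameters())`. A non-empty result means some trainable weights are in FP16, which is the condition that triggers this ValueError.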

Expected Behavior

No response

Steps To Reproduce

1. Environment:

Package Version
accelerate 0.24.1
aiofiles 23.2.1
aiohttp 3.8.6
aiosignal 1.3.1
altair 5.1.2
annotated-types 0.6.0
anyio 3.7.1
asttokens 2.0.5
astunparse 1.6.3
async-timeout 4.0.3
attrs 23.1.0
backcall 0.2.0
beautifulsoup4 4.11.1
brotlipy 0.7.0
certifi 2022.9.24
cffi 1.15.1
chardet 4.0.0
charset-normalizer 2.0.4
click 8.1.7
colorama 0.4.6
conda-package-handling 1.9.0
contourpy 1.0.7
cpm-kernels 1.0.11
cryptography 38.0.1
cycler 0.11.0
datasets 2.15.0
decorator 5.1.1
dill 0.3.7
dnspython 2.2.1
docstring-parser 0.15
exceptiongroup 1.0.4
executing 0.8.3
expecttest 0.1.4
fastapi 0.104.1
ffmpy 0.3.1
filelock 3.6.0
flit_core 3.6.0
fonttools 4.39.3
frozenlist 1.4.0
fsspec 2023.10.0
future 0.18.2
glob2 0.7
gradio 4.3.0
gradio_client 0.7.0
h11 0.14.0
httpcore 1.0.2
httpx 0.25.1
huggingface-hub 0.19.3
hypothesis 6.61.0
idna 3.4
importlib-resources 6.1.1
ipython 8.7.0
jedi 0.18.1
jieba 0.42.1
Jinja2 2.11.3
joblib 1.3.2
jsonschema 4.19.2
jsonschema-specifications 2023.11.1
kiwisolver 1.4.4
latex2mathml 3.76.0
libarchive-c 2.9
Markdown 3.5.1
markdown-it-py 3.0.0
MarkupSafe 2.0.1
matplotlib 3.7.1
matplotlib-inline 0.1.6
mdtex2html 1.2.0
mdurl 0.1.2
mkl-fft 1.3.1
mkl-random 1.2.2
mkl-service 2.4.0
mpmath 1.2.1
multidict 6.0.4
multiprocess 0.70.15
nltk 3.8.1
numpy 1.26.2
opencv-python 4.7.0.72
orjson 3.9.10
packaging 23.1
pandas 2.1.3
parso 0.8.3
peft 0.6.2
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.3.0
pip 22.3.1
pkginfo 1.8.3
pluggy 1.0.0
prompt-toolkit 3.0.20
protobuf 4.25.1
psutil 5.9.0
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 14.0.1
pyarrow-hotfix 0.5
pycosat 0.6.4
pycparser 2.21
pydantic 2.5.1
pydantic_core 2.14.3
pydub 0.25.1
Pygments 2.16.1
pyOpenSSL 22.0.0
pyparsing 3.0.9
PySocks 1.7.1
python-dateutil 2.8.2
python-etcd 0.4.5
python-multipart 0.0.6
pytz 2022.1
PyYAML 6.0
referencing 0.31.0
regex 2023.10.3
requests 2.28.1
rich 13.7.0
rouge-chinese 1.0.3
rpds-py 0.12.0
ruamel.yaml 0.17.21
ruamel.yaml.clib 0.2.6
safetensors 0.4.0
scipy 1.10.1
semantic-version 2.10.0
sentencepiece 0.1.99
setuptools 65.5.0
shellingham 1.5.4
shtab 1.6.4
six 1.16.0
sniffio 1.3.0
sortedcontainers 2.4.0
soupsieve 2.3.2.post1
stack-data 0.2.0
starlette 0.27.0
sympy 1.11.1
tokenizers 0.13.3
toml 0.10.2
tomlkit 0.12.0
toolz 0.12.0
torch 1.13.1
torchelastic 0.2.2
torchsummary 1.5.1
torchtext 0.14.1
torchvision 0.14.1
tqdm 4.64.1
traitlets 5.7.1
transformers 4.30.0
trl 0.7.4
typer 0.9.0
types-dataclasses 0.6.6
typing_extensions 4.8.0
tyro 0.5.14
tzdata 2023.3
urllib3 1.26.13
uvicorn 0.24.0.post1
wcwidth 0.2.5
websockets 11.0.3
wheel 0.37.1
xxhash 3.4.1
yarl 1.9.2

2. Run `CUDA_VISIBLE_DEVICES=3 python finetune_ppo.py`

Environment

- OS:Ubuntu 20.04
- Python:3.10
- Transformers:4.30
- PyTorch:1.13.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :True

Anything else?

No response

mockyd commented 11 months ago

Rolling peft back from 0.6.2 to 0.5.0 fixed it.
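For anyone reproducing this, a small pre-flight check can flag a peft install newer than the version reported to work here. This is a minimal sketch using only the standard library. The function names are hypothetical, and the only data points from this thread are that 0.6.2 failed and 0.5.0 worked; versions in between are untested.

```python
from importlib import metadata

KNOWN_GOOD_PEFT = "0.5.0"  # version reported to work in this thread


def parse_version(text):
    """Turn '0.6.2' into (0, 6, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_peft_version():
    """Return the installed peft version string, or None if peft is absent."""
    try:
        return metadata.version("peft")
    except metadata.PackageNotFoundError:
        return None


def peft_status(installed, known_good=KNOWN_GOOD_PEFT):
    """Describe how the installed version compares with the known-good one."""
    if installed is None:
        return "peft is not installed"
    if parse_version(installed) > parse_version(known_good):
        return f"peft {installed} is newer than {known_good}, the version reported to work"
    return f"peft {installed} is at or below {known_good}"


if __name__ == "__main__":
    print(peft_status(installed_peft_version()))
```

If the check reports a newer version, `pip install peft==0.5.0` reproduces the rollback described above.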