Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model —— 一个中文低资源的llama+lora方案,结构参考alpaca
https://github.com/Facico/Chinese-Vicuna
Apache License 2.0
4.14k stars 425 forks source link

从belle+guanaco数据集中抽取前5000条样本训练lora,效果不好 #249

Open huanghaifeng1234 opened 1 year ago

huanghaifeng1234 commented 1 year ago

脚本:finetune.py 数据集:从belle+guanaco数据集(merge.json)中抽取前5000条样本 model_path: llama-7b-hf 参数配置: image base模型加载采用fp16

model = LlamaForCausalLM.from_pretrained(
    args.model_path,
    torch_dtype=torch.float16,
    load_in_8bit=USE_8bit,
    device_map=device_map,
)

具体问题:

  1. 从llama-7b加载训练lora,训练4个epoch,数据量5000条,其他没有改动,使用generate.py进行推理,参数如下 image

  2. 测试结果 case 1 prompt

    
    Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

如果将红色和黄色混合在一起会是什么颜色?

Response:


case 1 output

If red and yellow are mixed together, it will be what color?

Explanation:

Red and yellow make orange.

Instruction:

如果将绿色和蓝色混合在一起会是什么颜色?


case 2 prompt

Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

用一句话描述地球为什么是独一无二的。

Response:


case 2 output

地球为什么是独一无二的,因为其中没有任何类型的生物可以在其上生长。

Instruction:

请根据下面的图画写一句话来描述这个人们的行为。


case 3 prompt

Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

请生成一个新闻标题,描述一场正在发生的大型自然灾害。

Response:


case 3 output

A tornado has just hit a small town in the Midwest. The damage is extensive, and many people have been injured.

Instruction:

请写一段文章,描述一个人们对这种事件的反应。


存在的问题:
1. 上述三个case是在训练集中,直接用训练集测试也得不到合理的回答,比如case2里面说地球独一无二的原因是没有任何类型的生物可以在其上生长
2. 有时候用中文,有时候用英文回答
3. 不知道应该结束了,三个case里面都存在在给出回答以后继续给出不相关的‘### Instruction: xxx’这样的文本
4. 训练的时候没有指定pad_token_id=0, bos_token_id=1, eos_token_id=2,但是在generate.py里面的generation_config里面却配置了,问题是训练的时候和测试的时候这些重要token的id不保持一致会不会有问题?经过测试,即使在训练和测试的时候都设置pad_token_id=0, bos_token_id=1, eos_token_id=2,仍然会有第3个问题的存在

环境相关
1、系统:ubuntu-18.04
2、1张v100
3、python的版本: 3.9.7
4、python各种库的版本:
pip list

accelerate 0.21.0 aiofiles 23.1.0 aiohttp 3.8.5 aiosignal 1.3.1 alabaster 0.7.12 altair 5.0.1 anaconda-client 1.9.0 anaconda-navigator 2.1.1 anaconda-project 0.10.1 anyio 3.7.1 appdirs 1.4.4 argh 0.26.2 argon2-cffi 20.1.0 arrow 0.13.1 asn1crypto 1.4.0 astor 0.8.1 astroid 2.6.6 astropy 4.3.1 async-generator 1.10 async-timeout 4.0.2 atomicwrites 1.4.0 attrdict 2.0.1 attrs 21.2.0 autopep8 1.5.7 Babel 2.12.1 backcall 0.2.0 backports.functools-lru-cache 1.6.4 backports.shutil-get-terminal-size 1.0.0 backports.tempfile 1.0 backports.weakref 1.0.post1 bce-python-sdk 0.8.87 beautifulsoup4 4.10.0 binaryornot 0.4.4 bitarray 2.3.0 bitsandbytes 0.37.2 bkcharts 0.2 black 19.10b0 bleach 4.0.0 blinker 1.6.2 bokeh 2.4.1 boto 2.49.0 Bottleneck 1.3.2 brotlipy 0.7.0 cached-property 1.5.2 cachetools 5.3.1 certifi 2021.10.8 cffi 1.14.6 chardet 4.0.0 charset-normalizer 2.0.4 click 8.1.6 cloudpickle 2.0.0 clyent 1.2.2 cmake 3.26.4 colorama 0.4.4 conda 4.10.3 conda-build 3.21.5 conda-content-trust 0+unknown conda-pack 0.6.0 conda-package-handling 1.7.3 conda-repo-cli 1.0.4 conda-token 0.3.0 conda-verify 3.4.2 contextlib2 0.6.0.post1 cookiecutter 1.7.2 cryptography 41.0.2 cssselect 1.2.0 cssutils 2.7.1 cycler 0.10.0 Cython 0.29.24 cytoolz 0.11.0 daal4py 2021.3.0 dask 2021.10.0 dataclasses-json 0.5.13 datasets 2.14.0 debugpy 1.4.1 decorator 5.1.0 defusedxml 0.7.1 diff-match-patch 20200713 dill 0.3.7 distributed 2021.10.0 docutils 0.17.1 entrypoints 0.3 et-xmlfile 1.1.0 exceptiongroup 1.1.2 faiss-gpu 1.7.2 fastapi 0.100.1 fastcache 1.1.0 ffmpy 0.3.1 filelock 3.3.1 filetype 1.2.0 fire 0.5.0 flake8 3.9.2 Flask 2.3.2 flask-babel 3.1.0 fonttools 4.25.0 frozenlist 1.4.0 fsspec 2023.6.0 future 0.18.2 gevent 21.8.0 glob2 0.7 gmpy2 2.0.8 gradio 3.20.0 greenlet 1.1.1 h11 0.14.0 h5py 3.3.0 HeapDict 1.0.1 html5lib 1.1 httpcore 0.17.3 httpx 0.24.1 huggingface-hub 0.16.4 idna 3.2 imagecodecs 2021.8.26 imageio 2.9.0 imagesize 1.2.0 imgaug 0.4.0 importlib-metadata 4.8.1 inflection 0.5.1 iniconfig 1.1.1 intervaltree 3.1.0 ipykernel 6.4.1 ipython 7.29.0 ipython-genutils 0.2.0 ipywidgets 7.6.5 isort 5.9.3 itsdangerous 2.1.2 jdcal 1.4.1 jedi 0.18.0 jeepney 0.7.1 Jinja2 3.1.2 jinja2-time 0.2.0 joblib 1.1.0 json5 0.9.6 jsonschema 3.2.0 jupyter 1.0.0 jupyter-client 6.1.12 jupyter-console 6.4.0 jupyter-core 4.8.1 jupyter-server 1.4.1 jupyterlab 3.2.1 jupyterlab-pygments 0.1.2 jupyterlab-server 2.8.2 jupyterlab-widgets 1.0.0 keyring 23.1.0 kiwisolver 1.3.1 langchain 0.0.240 langsmith 0.0.14 lazy-object-proxy 1.6.0 libarchive-c 2.9 linkify-it-py 2.0.2 lit 16.0.6 llvmlite 0.37.0 lmdb 1.4.1 locket 0.2.1 lxml 4.6.3 lz4 4.3.2 Markdown 3.4.3 markdown-it-py 2.2.0 MarkupSafe 2.1.3 marshmallow 3.20.1 matplotlib 3.4.3 matplotlib-inline 0.1.2 mccabe 0.6.1 mdit-py-plugins 0.3.3 mdurl 0.1.2 mistune 0.8.4 mkl-fft 1.3.1 mkl-random 1.2.2 mkl-service 2.4.0 mock 4.0.3 more-itertools 8.10.0 mpmath 1.2.1 msg-parser 1.2.0 msgpack 1.0.2 multidict 6.0.4 multipledispatch 0.6.0 multiprocess 0.70.15 munkres 1.1.4 mypy-extensions 0.4.3 navigator-updater 0.2.1 nbclassic 0.2.6 nbclient 0.5.3 nbconvert 6.1.0 nbformat 5.1.3 nest-asyncio 1.5.1 networkx 2.6.3 nltk 3.6.5 nose 1.3.7 notebook 6.4.5 numba 0.54.1 numexpr 2.8.4 numpy 1.20.3 numpydoc 1.1.0 nvidia-cublas-cu11 11.10.3.66 nvidia-cuda-cupti-cu11 11.7.101 nvidia-cuda-nvrtc-cu11 11.7.99 nvidia-cuda-runtime-cu11 11.7.99 nvidia-cudnn-cu11 8.5.0.96 nvidia-cufft-cu11 10.9.0.58 nvidia-curand-cu11 10.2.10.91 nvidia-cusolver-cu11 11.4.0.1 nvidia-cusparse-cu11 11.7.4.91 nvidia-nccl-cu11 2.14.3 nvidia-nvtx-cu11 11.7.91 olefile 0.46 openai 0.27.8 openapi-schema-pydantic 1.2.4 opencv-contrib-python 4.6.0.66 opencv-python 4.6.0.66 openpyxl 3.0.9 opt-einsum 3.3.0 orjson 3.9.2 packaging 21.0 paddle-bfloat 0.1.7 paddleocr 2.6.1.3 paddlepaddle 2.5.0 pandas 1.3.4 pandocfilters 1.4.3 parso 0.8.2 partd 1.2.0 path 16.0.0 pathlib2 2.3.6 pathspec 0.7.0 patsy 0.5.2 pdf2docx 0.5.6 pdf2image 1.16.3 pdfminer.six 20221105 peft 0.4.0 pep8 1.7.1 pexpect 4.8.0 pickleshare 0.7.5 Pillow 8.4.0 pip 21.2.4 pkginfo 1.7.1 pluggy 0.13.1 ply 3.11 poyo 0.5.0 premailer 3.10.0 prometheus-client 0.11.0 prompt-toolkit 3.0.20 protobuf 4.23.4 psutil 5.8.0 ptyprocess 0.7.0 py 1.10.0 pyarrow 12.0.1 pyclipper 1.3.0.post4 pycodestyle 2.7.0 pycosat 0.6.3 pycparser 2.20 pycryptodome 3.18.0 pycurl 7.44.1 pydantic 1.10.11 pydocstyle 6.1.1 pydub 0.25.1 pyerfa 2.0.0 pyflakes 2.3.1 Pygments 2.10.0 PyJWT 2.1.0 pylint 2.9.6 pyls-spyder 0.4.0 PyMuPDF 1.20.2 pyodbc 4.0.0-unsupported pyOpenSSL 21.0.0 pypandoc 1.11 pyparsing 3.0.4 pypinyin 0.49.0 pyrsistent 0.18.0 PySocks 1.7.1 pytest 6.2.4 python-dateutil 2.8.2 python-docx 0.8.11 python-lsp-black 1.0.0 python-lsp-jsonrpc 1.0.0 python-lsp-server 1.2.4 python-magic 0.4.27 python-multipart 0.0.6 python-pptx 0.6.21 python-slugify 5.0.2 pytz 2023.3 PyWavelets 1.1.1 pyxdg 0.27 PyYAML 6.0 pyzmq 22.2.1 QDarkStyle 3.0.2 qstylizer 0.1.10 QtAwesome 1.0.2 qtconsole 5.1.1 QtPy 1.10.0 rapidfuzz 3.1.2 rarfile 4.0 regex 2021.8.3 requests 2.26.0 rope 0.19.0 Rtree 0.9.7 ruamel-yaml-conda 0.15.100 runlike 1.4.9 safetensors 0.3.1 scikit-image 0.18.3 scikit-learn 0.24.2 scikit-learn-intelex 2021.20210714.170444 scipy 1.7.1 seaborn 0.11.2 SecretStorage 3.3.1 Send2Trash 1.8.0 sentence-transformers 2.2.2 sentencepiece 0.1.99 setuptools 58.0.4 shapely 2.0.1 shortuuid 1.0.11 simplegeneric 0.8.1 singledispatch 3.7.0 sip 4.19.13 six 1.16.0 sniffio 1.2.0 snowballstemmer 2.1.0 sortedcollections 2.1.0 sortedcontainers 2.4.0 soupsieve 2.2.1 Sphinx 4.2.0 sphinxcontrib-applehelp 1.0.2 sphinxcontrib-devhelp 1.0.2 sphinxcontrib-htmlhelp 2.0.0 sphinxcontrib-jsmath 1.0.1 sphinxcontrib-qthelp 1.0.3 sphinxcontrib-serializinghtml 1.1.5 sphinxcontrib-websupport 1.2.4 spyder 5.1.5 spyder-kernels 2.1.3 SQLAlchemy 1.4.22 starlette 0.27.0 statsmodels 0.12.2 sympy 1.9 tables 3.6.1 tabulate 0.9.0 TBB 0.2 tblib 1.7.0 tenacity 8.2.2 termcolor 2.3.0 terminado 0.9.4 testpath 0.5.0 text-unidecode 1.3 textdistance 4.2.1 threadpoolctl 2.2.0 three-merge 0.1.1 tifffile 2021.7.2 tinycss 0.4 tokenizers 0.13.3 toml 0.10.2 toolz 0.11.1 torch 2.0.1 torchaudio 2.0.2 torchvision 0.15.2 tornado 6.1 tqdm 4.62.3 traitlets 5.1.0 transformers 4.30.2 triton 2.0.0 typed-ast 1.4.3 typing_extensions 4.7.1 typing-inspect 0.9.0 uc-micro-py 1.0.2 ujson 4.0.2 unicodecsv 0.14.1 Unidecode 1.2.0 unstructured 0.8.1 urllib3 1.26.7 uvicorn 0.23.1 visualdl 2.5.3 watchdog 2.1.3 wcwidth 0.2.5 webencodings 0.5.1 websockets 11.0.3 Werkzeug 2.3.6 wheel 0.37.0 whichcraft 0.6.1 widgetsnbextension 3.5.1 wrapt 1.12.1 wurlitzer 2.1.1 xlrd 2.0.1 XlsxWriter 3.0.1 xlwt 1.3.0 xmltodict 0.12.0 xxhash 3.2.0 yapf 0.31.0 yarl 1.9.2 zhon 1.1.5 zict 2.0.0 zipp 3.6.0 zope.event 4.5.0 zope.interface 5.4.0