Closed KaiserWhoLearns closed 5 months ago
I think this may be more of an issue for open-instruct, but it is also quite normal for degenerate behaviors to appear during fine-tuning. Trying new datasets, new parameters, etc. is normal. It's good feedback on the model, but I'm not sure it's relevant here. And yes, we'll keep improving the models.
🐛 Describe the bug
Thanks for making the model available! I was trying to fine-tune OLMO-1b with the Open Instruct code base. However, after fine-tuning for a certain amount of time/instances, the model starts to generate nothing (""), while the un-fine-tuned model behaves normally during generation and achieves reasonable performance on the datasets I am using. An example of input (prompt)-gold (completion)-output for the first row is included here: https://gist.github.com/KaiserWhoLearns/fe5260b08878f2cfb7e40e42a2239afa
Is it an issue of prompt format?
I tried two different prompt formats (they differed in the space after ":"). When the model is trained with the first format, it generates nothing (""), as shown in the table above. When the model is trained with the second format, the issue in the table occurs: it simply repeats colons ("::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::").
Could it be a result of bad hyperparameters?
I tried learning rates of 2e-5 and 2e-3; the behavior persists.
Is the code working, after all?
I tried the exact same training code with LLAMA, which is generating as normal and shows that fine-tuning brings a performance improvement compared to the non-fine-tuned version.
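To make the prompt-format question concrete, here is a minimal sketch of two variants that differ only in the space after the colon. The "Summary" field name and the build_prompt helper are hypothetical and used only for illustration; the actual prompts are the ones in the gist linked above.

```python
# Hypothetical template: the "Summary" label and this helper are illustrative
# assumptions, not the prompts actually used in the fine-tuning runs.
def build_prompt(document: str, space_after_colon: bool) -> str:
    """Build a prompt that ends with 'Summary:' or 'Summary: '."""
    suffix = "Summary: " if space_after_colon else "Summary:"
    return f"{document}\n{suffix}"


doc = "Some article text."
no_space = build_prompt(doc, space_after_colon=False)
with_space = build_prompt(doc, space_after_colon=True)

# The two variants differ by exactly one trailing character.
print(repr(no_space[-9:]))
print(repr(with_space[-9:]))
```

Depending on the tokenizer, that single trailing space may be encoded as its own token or merged into the first token of the completion, so it can be worth comparing the token IDs at the prompt/completion boundary for both variants.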
Other information
Datasets I've tried: XSUM, SocialIQA, both with no luck. The training curve seems normal: https://wandb.ai/kaisersunhk/open_instruct/runs/rraheptp?nw=nwuserkaisersunhk
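Since the training curve does not reveal the problem, one way to track it is to measure the share of degenerate generations per checkpoint. The following is a rough sketch, not part of Open Instruct: the helper names classify_generation and degenerate_rate and the min_run threshold are assumptions introduced here for illustration.

```python
from typing import List


def classify_generation(text: str, min_run: int = 10) -> str:
    """Label one generated string as 'empty', 'repeated_char', or 'ok'.

    min_run is an arbitrary, assumed threshold for how long a
    single-character output must be before it is flagged.
    """
    stripped = text.strip()
    if not stripped:
        return "empty"
    if len(stripped) >= min_run and len(set(stripped)) == 1:
        return "repeated_char"
    return "ok"


def degenerate_rate(outputs: List[str]) -> float:
    """Fraction of outputs that are empty or a single repeated character."""
    if not outputs:
        return 0.0
    bad = sum(classify_generation(o) != "ok" for o in outputs)
    return bad / len(outputs)


samples = ["", ":" * 60, "A short summary of the article."]
for s in samples:
    print(repr(s[:20]), "->", classify_generation(s))
print("degenerate rate:", degenerate_rate(samples))
```

Running this over the generations from several saved checkpoints would show roughly when the outputs collapse, which the loss curve alone does not show.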
Code to reproduce the issue
The code is based on the Open-Instruct repository, with the modification to try different prompt formats. The command to reproduce the generation is
Versions
Python 3.11.0 absl-py==2.1.0 accelerate==0.27.2 ai2-olmo==0.2.4 aiofiles==23.2.1 aiohttp==3.9.3 aiosignal==1.3.1 alpaca_eval==0.5.3 altair==5.2.0 antlr4-python3-runtime==4.9.3 anyio==4.3.0 appdirs==1.4.4 attrs==23.2.0 auto-gptq==0.6.0 bitsandbytes==0.42.0 blinker==1.7.0 boto3==1.34.57 botocore==1.34.57 cached_path==1.6.2 cachetools==5.3.3 certifi==2024.2.2 charset-normalizer==3.3.2 click==8.1.7 cmake==3.28.3 contourpy==1.2.0 cycler==0.12.1 datasets==2.14.7 deepspeed==0.13.5 dill==0.3.7 distro==1.9.0 docker-pycreds==0.4.0 einops==0.7.0 et-xmlfile==1.1.0 evaluate==0.4.1 fastapi==0.110.0 ffmpy==0.3.2 filelock==3.13.1 fire==0.5.0 flash-attn==2.2.2 Flask==3.0.2 fonttools==4.49.0 frozenlist==1.4.1 fsspec==2023.10.0 gekko==1.0.7 gitdb==4.0.11 GitPython==3.1.42 google-api-core==2.17.1 google-auth==2.28.1 google-cloud-core==2.4.1 google-cloud-storage==2.15.0 google-crc32c==1.5.0 google-resumable-media==2.7.0 googleapis-common-protos==1.62.0 gradio==3.50.2 gradio_client==0.6.1 grpcio==1.62.0 h11==0.14.0 hjson==3.1.0 httpcore==1.0.4 httptools==0.6.1 httpx==0.27.0 huggingface-hub==0.21.4 idna==3.6 importlib_resources==6.1.2 itsdangerous==2.1.2 Jinja2==3.1.3 jmespath==1.0.1 joblib==1.3.2 jsonlines==4.0.0 jsonschema==4.21.1 jsonschema-specifications==2023.12.1 kiwisolver==1.4.5 lit==17.0.6 Markdown==3.5.2 markdown-it-py==3.0.0 MarkupSafe==2.1.5 matplotlib==3.8.3 mdurl==0.1.2 mpmath==1.3.0 msgpack==1.0.8 multidict==6.0.5 multiprocess==0.70.15 networkx==3.2.1 ninja==1.11.1.1 nltk==3.8.1 numpy==1.26.4 nvidia-cublas-cu11==11.10.3.66 nvidia-cuda-cupti-cu11==11.7.101 nvidia-cuda-nvrtc-cu11==11.7.99 nvidia-cuda-runtime-cu11==11.7.99 nvidia-cudnn-cu11==8.5.0.96 nvidia-cufft-cu11==10.9.0.58 nvidia-curand-cu11==10.2.10.91 nvidia-cusolver-cu11==11.4.0.1 nvidia-cusparse-cu11==11.7.4.91 nvidia-nccl-cu11==2.14.3 nvidia-nvtx-cu11==11.7.91 omegaconf==2.3.0 openai==1.13.3 openpyxl==3.1.2 orjson==3.9.15 packaging==23.2 pandas==2.2.1 peft==0.9.0 pillow==10.2.0 protobuf==4.25.3 psutil==5.9.8 
py-cpuinfo==9.0.0 pyarrow==15.0.0 pyarrow-hotfix==0.6 pyasn1==0.5.1 pyasn1-modules==0.3.0 pydantic==1.10.14 pydub==0.25.1 Pygments==2.17.2 pynvml==11.5.0 pyparsing==3.1.2 python-dateutil==2.9.0.post0 python-dotenv==1.0.1 python-multipart==0.0.9 pytz==2024.1 PyYAML==6.0.1 ray==2.9.3 referencing==0.33.0 regex==2023.12.25 requests==2.31.0 responses==0.18.0 rich==13.7.1 rouge==1.0.1 rouge-score==0.1.2 rpds-py==0.18.0 rsa==4.9 s3transfer==0.10.0 safetensors==0.4.2 scipy==1.12.0 semantic-version==2.10.0 sentencepiece==0.2.0 sentry-sdk==1.40.6 setproctitle==1.3.3 six==1.16.0 smmap==5.0.1 sniffio==1.3.1 starlette==0.36.3 sympy==1.12 tensorboard==2.16.2 tensorboard-data-server==0.7.2 termcolor==2.4.0 tiktoken==0.6.0 tokenizers==0.15.2 toolz==0.12.1 torch==2.0.1 torchaudio==2.0.2 torchvision==0.15.2 tqdm==4.66.2 transformers==4.38.2 triton==2.0.0 typing_extensions==4.10.0 tzdata==2024.1 unidic-lite==1.0.8 urllib3==2.0.7 uvicorn==0.27.1 uvloop==0.19.0 vllm==0.2.1.post1 wandb==0.16.4 watchfiles==0.21.0 websockets==11.0.3 Werkzeug==3.0.1 xformers==0.0.22 xxhash==3.4.1 yarl==1.9.4