mistralai / mistral-finetune

Apache License 2.0
2.77k stars 232 forks source link

I have finetuned mistral-instruct-v0.3 but fine tuned model got worse. It doesn not even answer what original v0.3 can answer. #82

Open ZTAP0011 opened 4 months ago

ZTAP0011 commented 4 months ago

Python Version

python 3.10.9

Pip Freeze

annotated-types==0.7.0
anyio==4.4.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens @ file:///opt/conda/conda-bld/asttokens_1646925590279/work
astunparse==1.6.3
async-lru==2.0.4
attrs==22.2.0
Babel==2.15.0
backcall @ file:///home/ktietz/src/ci/backcall_1611930011877/work
bash_kernel==0.9.3
beautifulsoup4 @ file:///opt/conda/conda-bld/beautifulsoup4_1650462163268/work
bleach==6.1.0
brotlipy==0.7.0
certifi @ file:///croot/certifi_1671487769961/work/certifi
cffi @ file:///croot/cffi_1670423208954/work
chardet @ file:///home/builder/ci_310/chardet_1640804867535/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
comm==0.2.2
conda==23.1.0
conda-build==3.23.3
conda-content-trust @ file:///tmp/abs_5952f1c8-355c-4855-ad2e-538535021ba5h26t22e5/croots/recipe/conda-content-trust_1658126371814/work
conda-package-handling @ file:///croot/conda-package-handling_1672865015732/work
conda_package_streaming @ file:///croot/conda-package-streaming_1670508151586/work
cryptography @ file:///croot/cryptography_1677533068310/work
debugpy==1.8.2
decorator @ file:///opt/conda/conda-bld/decorator_1643638310831/work
defusedxml==0.7.1
dnspython==2.3.0
docstring_parser==0.16
exceptiongroup==1.1.1
executing @ file:///opt/conda/conda-bld/executing_1646925071911/work
expecttest==0.1.4
fastjsonschema==2.20.0
filelock @ file:///croot/filelock_1672387128942/work
fire==0.6.0
flit_core @ file:///opt/conda/conda-bld/flit-core_1644941570762/work/source/flit_core
fqdn==1.5.1
fsspec==2024.6.1
glob2 @ file:///home/linux1/recipes/ci/glob2_1610991677669/work
gmpy2 @ file:///tmp/build/80754af9/gmpy2_1645455533097/work
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
hypothesis==6.70.0
idna @ file:///croot/idna_1666125576474/work
iniconfig==2.0.0
ipykernel==6.29.5
ipython @ file:///croot/ipython_1676582224036/work
ipywidgets==8.1.3
isoduration==20.11.0
jedi @ file:///tmp/build/80754af9/jedi_1644315229345/work
Jinja2 @ file:///croot/jinja2_1666908132255/work
json5==0.9.25
jsonpointer==3.0.0
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-archive==3.4.0
jupyter-console==6.6.3
jupyter-events==0.10.0
jupyter-http-over-ws==0.0.8
jupyter-lsp==2.2.5
jupyter_client==8.6.2
jupyter_core==5.7.2
jupyter_server==2.14.1
jupyter_server_terminals==0.5.3
jupyterlab==4.2.3
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.2
jupyterlab_widgets==3.0.11
libarchive-c @ file:///tmp/build/80754af9/python-libarchive-c_1617780486945/work
MarkupSafe @ file:///opt/conda/conda-bld/markupsafe_1654597864307/work
matplotlib-inline @ file:///opt/conda/conda-bld/matplotlib-inline_1662014470464/work
mistral_common==1.2.1
mistral_inference==1.1.0
mistune==3.0.2
mkl-fft==1.3.1
mkl-random @ file:///home/builder/ci_310/mkl_random_1641843545607/work
mkl-service==2.4.0
mpmath==1.3.0
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nbzip==0.1.0
nest-asyncio==1.6.0
networkx==3.0
notebook==7.2.1
notebook_shim==0.2.4
numpy @ file:///croot/numpy_and_numpy_base_1672336185480/work
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.5.82
nvidia-nvtx-cu12==12.1.105
overrides==7.7.0
packaging==24.1
pandocfilters==1.5.1
parso @ file:///opt/conda/conda-bld/parso_1641458642106/work
pexpect @ file:///tmp/build/80754af9/pexpect_1605563209008/work
pickleshare @ file:///tmp/build/80754af9/pickleshare_1606932040724/work
Pillow==9.4.0
pkginfo @ file:///croot/pkginfo_1666725041340/work
platformdirs==4.2.2
pluggy==1.5.0
prometheus_client==0.20.0
prompt-toolkit @ file:///croot/prompt-toolkit_1672387306916/work
psutil @ file:///opt/conda/conda-bld/psutil_1656431268089/work
ptyprocess @ file:///tmp/build/80754af9/ptyprocess_1609355006118/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
pure-eval @ file:///opt/conda/conda-bld/pure_eval_1646925070566/work
pycosat @ file:///croot/pycosat_1666805502580/work
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pydantic==2.6.1
pydantic_core==2.16.2
Pygments @ file:///opt/conda/conda-bld/pygments_1644249106324/work
pyOpenSSL @ file:///croot/pyopenssl_1677607685877/work
PySocks @ file:///home/builder/ci_310/pysocks_1640793678128/work
pytest==8.2.2
python-dateutil==2.9.0.post0
python-etcd==0.4.5
python-json-logger==2.0.7
pytz @ file:///croot/pytz_1671697431263/work
PyYAML @ file:///croot/pyyaml_1670514731622/work
pyzmq==26.0.3
qtconsole==5.5.2
QtPy==2.4.1
referencing==0.35.1
requests==2.32.3
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.18.1
ruamel.yaml @ file:///croot/ruamel.yaml_1666304550667/work
ruamel.yaml.clib @ file:///croot/ruamel.yaml.clib_1666302247304/work
safetensors==0.4.3
Send2Trash==1.8.3
sentencepiece==0.1.99
simple_parsing==0.1.5
six @ file:///tmp/build/80754af9/six_1644875935023/work
sniffio==1.3.1
sortedcontainers==2.4.0
soupsieve @ file:///croot/soupsieve_1666296392845/work
stack-data @ file:///opt/conda/conda-bld/stack_data_1646927590127/work
sympy @ file:///croot/sympy_1668202399572/work
termcolor==2.4.0
terminado==0.18.1
tinycss2==1.3.0
toml @ file:///tmp/build/80754af9/toml_1616166611790/work
tomli==2.0.1
toolz @ file:///croot/toolz_1667464077321/work
torch==2.3.0
torchaudio==2.0.0
torchdata @ file:///__w/_temp/conda_build_env/conda-bld/torchdata_1678741239947/work
torchelastic==0.2.2
torchtext==0.15.0
torchvision==0.15.0
tornado==6.4.1
tqdm @ file:///opt/conda/conda-bld/tqdm_1664392687731/work
traitlets @ file:///croot/traitlets_1671143879854/work
triton==2.3.0
types-dataclasses==0.6.6
types-python-dateutil==2.9.0.20240316
typing_extensions==4.12.2
uri-template==1.3.0
urllib3 @ file:///croot/urllib3_1673575502006/work
wcwidth @ file:///Users/ktietz/demo/mc3/conda-bld/wcwidth_1629357192024/work
webcolors==24.6.0
webencodings==0.5.1
websocket-client==1.8.0
widgetsnbextension==4.0.11
xformers==0.0.26.post1
zstandard @ file:///croot/zstandard_1677013143055/work

Reproduction Steps

I fined tuned mistral model instruct v0.3 model as per the instructions mentioned here https://github.com/mistralai/mistral-finetune. Attaching my train.jsonl and eval.jsonl file.

Now when I ask my fine tuned model following question. we have following categories of products \n\n- Uncategorized,Accessories,Automotive,Baby Products,baby-products -,Bags,Beauty & Personal Care,Bottoms,boy,Bras & Tanks,Cell Phones & Accessories,Clothing,clothing -,Collectibles & Fine Art,Collections,Decor,Eco Friendly,Eco Friendly|Clothing,Electronics,Erin Recommends,Erin Recommends|Clothing,Fashion,Fitness Equipment,Fitness Equipment|Collections,Gear,girl,Health & Household,Hoodies,Hoodies & Sweatshirts,Interior,Jackets,Lawn & Garden,Men,Mens Fashion,Music,Musical Instruments,New Luma Yoga Collection,New Luma Yoga Collection|Clothing,Pants,Patio,Performance Fabrics,Performance Fabrics|Clothing,Promotions,Raincoats,Shirts,Shoes,Shorts,shorts bottoms women,Sports & Outdoors,Sweater,T-shirts,Tanks,Tees,Tees|Clothing,Tools & Home Improvement,Tops,Video Games,Watches,Watches|Collection,Winter,Women,Women Sale,Women Sale|Clothing,Womens Fashion.\n\n Please return categories from this list those matches with user need\n\n input: I want to purchase tshirts

It returns _{'followup':'sure, here are some trending tshirts for you $item', 'intent': 'browseproducts', 'metadata': {'categories': ['tshirts']}}

But when same question I ask to Mistral instruct v0.3 here it returns correct matched categories as below Based on the provided list, the categories that match your need for purchasing t-shirts are:

T-shirts Tees Tees|Clothing You may find your desired t-shirts in any of these categories. Happy shopping!

Why my fine tuned model is not returning T-Shirts, Tees, Tees|Clothing in metadata.categories.

train_and_eval_files.zip

Expected Behavior

My fine tuned model should return T-Shirts, Tees, Tees|Clothing in metadata. categories.

Additional Context

command to fine tune the model - torchrun --nproc-per-node 1 --master_port 8009 -m train example/7B.yaml

attaching ![Uploading 7B_yaml.JPG…]() 7B.yaml file

Suggested Solutions

No response

ZTAP0011 commented 4 months ago

7B_yaml

kiranshivaraju commented 3 months ago

can i know how you merged the model, because i am not able to merge the model in a 24GB VRAM, always run out of memory when i merge model instance i am using: ml g5 12x large

pandora-s-git commented 3 months ago

The model was trained with your data, and I've taken a look and I dont see any big difference, its just that after fine tuned it answers more closely to what your data suggested. I also noticed that you have a huge amount of cases where there is a single category, and this might also be playing a role. You do not have the category "Tees" at all in your dataset for example. There might be missing diversity. The output from the fine tuned is not wrong per see neither- it does seem indeed like it was properly fine tuned with your data.

Otherwise it can also depend on hyperparameters and the quantity of data you have. But your case is more related to the data itself.

Also, I noticed that your dataset has a lot of "You are a shopping assistant for the user.\n\n input: viewed products", what is "viewed products" here? 🤔

Im also confused why your expected result is to have the same output as the mistral 7b model, as usually we want to improve it?

ZTAP0011 commented 3 months ago

can i know how you merged the model, because i am not able to merge the model in a 24GB VRAM, always run out of memory when i merge model instance i am using: ml g5 12x large

@kiranshivaraju we were able to do it using https://github.com/mistralai/mistral-finetune and using vast ai 1x A100 SXM4 instance having 80 GB VRAM.