axolotl-ai-cloud / axolotl

https://axolotl-ai-cloud.github.io/axolotl/

Possible Bug in Chat Template Preprocessing #1761

Closed hammoudhasan closed 1 month ago

hammoudhasan commented 1 month ago

Please check that this issue hasn't been reported before.

Expected Behavior

Given a sample containing system, user, and assistant messages, the system and user messages should be masked out of the loss using the conventional -100 label value, so that only the assistant messages contribute to the loss.
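For reference, here is a minimal sketch (not axolotl's actual implementation) of the masking one would expect: every token from a non-assistant turn gets the label -100, which PyTorch's CrossEntropyLoss ignores by default, while every assistant turn keeps its real token ids as labels. The tokenizer and turn structure are illustrative assumptions.

```python
# Minimal sketch, not axolotl's implementation: mask every non-assistant turn
# with -100 so only assistant tokens contribute to the cross-entropy loss.
IGNORE_INDEX = -100  # CrossEntropyLoss ignores this label by default

def build_labels(turns, tokenizer):
    input_ids, labels = [], []
    for turn in turns:  # e.g. {"role": "user", "content": "..."}
        ids = tokenizer.encode(turn["content"], add_special_tokens=False)
        input_ids.extend(ids)
        if turn["role"] == "assistant":
            labels.extend(ids)  # train on every assistant turn
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # mask system/user turns
    return {"input_ids": input_ids, "labels": labels}
```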

Current behaviour

Currently, when a conversation with system, user, and assistant messages is passed in, all messages are masked except the last assistant message, which is left unmasked and used for training. What one would expect instead is that every assistant message is used when computing the loss, so the model learns from all of its turns.
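A quick way to see the difference is to count the contiguous spans of unmasked labels in a tokenized multi-turn sample: with the bug there is exactly one span (the final assistant reply), whereas the expected output has one span per assistant turn. The helper below is illustrative, not part of axolotl:

```python
def count_unmasked_spans(labels, ignore_index=-100):
    """Count contiguous runs of labels that are not the ignore index."""
    spans, in_span = 0, False
    for label in labels:
        if label != ignore_index:
            if not in_span:
                spans += 1
                in_span = True
        else:
            in_span = False
    return spans

# With the bug, a conversation with two assistant turns yields 1 span instead of 2.
```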

Steps to reproduce

Use ShareGPT-format data, process it with the chat_template prompt strategy, run axolotl.cli.preprocess, and check the label values.
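To inspect the labels after preprocessing, one option is to load the prepared dataset back with the datasets library. The path below is a placeholder for wherever dataset_prepared_path points in your config, and the column name assumes the usual tokenized output:

```python
from datasets import load_from_disk

# Placeholder path: replace with the directory axolotl wrote the prepared
# dataset to (dataset_prepared_path in your config).
ds = load_from_disk("path/to/prepared_dataset")

labels = ds[0]["labels"]  # assumes a "labels" column in the tokenized sample
trainable = sum(1 for label in labels if label != -100)
print(f"{trainable} of {len(labels)} tokens are unmasked (trained on)")
```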

Config yaml

No response

Possible solution

No response

Which Operating Systems are you using?

Python Version

3.10

axolotl branch-commit

main

Acknowledgements

Tostino commented 1 month ago

This is already fixed by: https://github.com/axolotl-ai-cloud/axolotl/pull/1756

ehartford commented 1 month ago

Thank you for finding this

Tostino commented 1 month ago

@hammoudhasan That PR has been merged now. This can be closed.

hammoudhasan commented 1 month ago

@Tostino Tested it out and everything is working well. Thank you!