Possible Bug in Chat Template Preprocessing

hammoudhasan commented 1 month ago

Please check that this issue hasn't been reported before.

[X] I searched previous Bug Reports and didn't find any similar reports.

Expected Behavior

Given a sample with system, user, and assistant messages, the system and user messages should be masked in the loss using the commonly used label value -100. Only the assistant messages should contribute to the loss.
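To make the expected masking concrete, here is a minimal sketch in plain Python. It does not use axolotl's code; the `build_labels` helper, the `(role, token_ids)` input format, and the token ids are all hypothetical and only illustrate the idea.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def build_labels(turns):
    """Build input_ids and labels from a list of (role, token_ids) pairs.

    Hypothetical input format: each turn is assumed to be tokenized already.
    Tokens from non-assistant roles get IGNORE_INDEX so they are excluded
    from the loss; assistant tokens keep their own ids as labels.
    """
    input_ids, labels = [], []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


# Made-up token ids for a two-round conversation.
turns = [
    ("system", [1, 2]),
    ("user", [3, 4, 5]),
    ("assistant", [6, 7]),
    ("user", [8]),
    ("assistant", [9, 10, 11]),
]
input_ids, labels = build_labels(turns)
print(labels)  # [-100, -100, -100, -100, -100, 6, 7, -100, 9, 10, 11]
```

Both assistant turns keep their labels here, not only the last one.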

Current behaviour

Currently, when a conversation with system, user, and assistant messages is passed in, all messages are masked except the last assistant message, which is the only one used for training. The expected behavior is that every assistant message contributes to the loss, so that the model learns from all assistant turns.

Steps to reproduce

Take a sharegpt-format dataset and process it with the chat_template prompt strategy. Run axolotl.cli.preprocess and inspect the label values.
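One way to inspect the label values is to list the contiguous spans that are not masked. The sketch below is a hypothetical helper, not part of axolotl; it assumes each preprocessed example exposes a `labels` list of integers, and the two sample lists are made up. Loading the preprocessed dataset is not shown.

```python
IGNORE_INDEX = -100  # masked positions carry this label


def trained_spans(labels):
    """Return (start, end) index pairs for each run of unmasked labels (end exclusive)."""
    spans, start = [], None
    for i, label in enumerate(labels):
        if label != IGNORE_INDEX and start is None:
            start = i  # a new unmasked run begins
        elif label == IGNORE_INDEX and start is not None:
            spans.append((start, i))  # the run ended at the previous index
            start = None
    if start is not None:
        spans.append((start, len(labels)))  # run extends to the end
    return spans


# Made-up labels for a conversation with two assistant turns.
only_last_turn = [-100] * 8 + [9, 10, 11]
all_turns = [-100] * 5 + [6, 7] + [-100] + [9, 10, 11]

print(trained_spans(only_last_turn))  # [(8, 11)]
print(trained_spans(all_turns))       # [(5, 7), (8, 11)]
```

If a multi-turn sample yields a single span, only one assistant message is being trained on, which matches the behavior reported here.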

Config yaml

No response

Possible solution

No response

Which Operating Systems are you using?

[X] Linux
[ ] macOS
[ ] Windows

Python Version

3.10

axolotl branch-commit

main

Acknowledgements

[X] My issue title is concise, descriptive, and in title casing.
[X] I have searched the existing issues to make sure this bug has not been reported yet.
[X] I am using the latest version of axolotl.
[X] I have provided enough information for the maintainers to reproduce and diagnose the issue.

Tostino commented 1 month ago

This is already fixed by: https://github.com/axolotl-ai-cloud/axolotl/pull/1756

ehartford commented 1 month ago

Thank you for finding this

Tostino commented 1 month ago

@hammoudhasan That PR has been merged now. This can be closed.

hammoudhasan commented 1 month ago

@Tostino All working well, I tested it out! Thank you, Tostino.
