Closed hammoudhasan closed 1 month ago
This is already fixed by: https://github.com/axolotl-ai-cloud/axolotl/pull/1756
Thank you for finding this
@hammoudhasan That PR has been merged now. This can be closed.
@Tostino All working well, I tested it out! Thank you, Tostino.
Expected Behavior
Given a sample with messages from system, user, and assistant, the system and user messages should be masked in the loss using the commonly used -100 label value. Only the assistant messages should contribute to the loss.
Current behaviour
Currently, if a conversation is passed with system, user, and assistant messages, all messages are masked except the last assistant message, which is the only one left unmasked and used for training. However, one would expect every assistant message to be included in the loss so that the model learns properly.
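To make the difference concrete, here is a minimal sketch in plain Python. It is not axolotl's code: `build_labels`, the `train_on_all_assistant` flag, and the toy token IDs are all hypothetical, made up only to contrast the expected masking with the masking described above.

```python
IGNORE_INDEX = -100  # label value that the loss skips


def build_labels(turns, train_on_all_assistant=True):
    """Build labels for a conversation given as (role, token_ids) pairs.

    Hypothetical helper for illustration only, not part of axolotl.
    """
    # Index of the final assistant turn (-1 if there is none).
    last_assistant = max(
        (i for i, (role, _) in enumerate(turns) if role == "assistant"),
        default=-1,
    )
    labels = []
    for i, (role, token_ids) in enumerate(turns):
        trainable = role == "assistant" and (
            train_on_all_assistant or i == last_assistant
        )
        # Trainable tokens keep their IDs as labels; the rest are masked.
        labels.extend(token_ids if trainable else [IGNORE_INDEX] * len(token_ids))
    return labels


# Toy conversation with made-up token IDs.
conversation = [
    ("system", [1, 2]),
    ("user", [3, 4, 5]),
    ("assistant", [6, 7]),
    ("user", [8]),
    ("assistant", [9, 10, 11]),
]

expected = build_labels(conversation, train_on_all_assistant=True)
reported = build_labels(conversation, train_on_all_assistant=False)
print(expected)  # [-100, -100, -100, -100, -100, 6, 7, -100, 9, 10, 11]
print(reported)  # [-100, -100, -100, -100, -100, -100, -100, -100, 9, 10, 11]
```

In the reported behaviour the first assistant turn (tokens 6 and 7) is masked as well, so it never contributes to the loss.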
Steps to reproduce
Use sharegpt data and process it with the chat_template prompt strategy. Run axolotl.cli.preprocess and check the label values.
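One way to check the label values is to count the contiguous unmasked spans in a sample's labels. The sketch below assumes each preprocessed sample exposes a flat `labels` list of integers; `unmasked_spans` is a hypothetical helper, not an axolotl function, and the list at the bottom is made-up data.

```python
IGNORE_INDEX = -100  # label value that the loss skips


def unmasked_spans(labels):
    """Return (start, end) index pairs of contiguous runs of unmasked labels.

    Hypothetical helper for illustration; `end` is exclusive.
    """
    spans = []
    start = None
    for i, label in enumerate(labels):
        if label != IGNORE_INDEX and start is None:
            start = i  # a new unmasked run begins
        elif label == IGNORE_INDEX and start is not None:
            spans.append((start, i))  # the run ended at the previous index
            start = None
    if start is not None:
        spans.append((start, len(labels)))  # run extends to the end
    return spans


# Made-up labels for a conversation with two assistant turns.
labels = [-100, -100, 5, 6, -100, 7]
print(unmasked_spans(labels))  # [(2, 4), (5, 6)]
```

For a multi-turn conversation, a single span at the end of the sample would match the behaviour reported here, while one span per assistant turn would match the expected behaviour.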
Config yaml
No response
Possible solution
No response
Which Operating Systems are you using?
Python Version
3.10
axolotl branch-commit
main