Fine-tune a small foundational LLM on a typical large-enterprise use case and compare the results against pre-trained and large-scale models. The instruction dataset will be generated synthetically with a SOTA LLM.
Make sure the input (role=user) part of chat messages is excluded from the loss (labels set to -100) #7
This could help once merged: https://github.com/huggingface/transformers/pull/30650
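Until such upstream support is available, the masking can be done by hand: copy the token ids to the labels and overwrite every non-assistant token with -100, which PyTorch's `CrossEntropyLoss` ignores by default (`ignore_index=-100`). A minimal sketch, assuming a per-token assistant mask is already available (the example ids and mask below are made up for illustration):

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def mask_non_assistant_labels(input_ids, assistant_mask):
    """Build labels from input_ids, replacing every token that is not part
    of an assistant turn with IGNORE_INDEX so the loss is only computed on
    the model's completions, not on the user prompt."""
    return [tok if is_assistant else IGNORE_INDEX
            for tok, is_assistant in zip(input_ids, assistant_mask)]

# Hypothetical toy sequence: tokens 0-3 belong to the user turn,
# tokens 4-6 to the assistant reply.
input_ids = [101, 7, 8, 9, 42, 43, 102]
assistant_mask = [0, 0, 0, 0, 1, 1, 1]

labels = mask_non_assistant_labels(input_ids, assistant_mask)
# labels == [-100, -100, -100, -100, 42, 43, 102]
```

Note this masks the labels (loss targets) only; the user tokens still receive full attention so the model can condition on them, they just contribute nothing to the gradient.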