ChrisTho23 / bizztune

Fine-tune a small foundational LLM on a typical large-enterprise use case and compare the results with pre-trained and large-scale models. The instruction dataset will be generated artificially with a SOTA LLM.

Make sure the input (role=user) of a chat message is excluded from the loss (label = -100) #7

Open ChrisTho23 opened 3 weeks ago

ChrisTho23 commented 3 weeks ago

This could help once it is merged: https://github.com/huggingface/transformers/pull/30650
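
Until then, a minimal sketch of the masking idea, assuming the conversation is already tokenized per message. The `build_labels` helper and the `(role, token_ids)` segment format are hypothetical and not part of this repo or of transformers. The value -100 is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, so positions labeled -100 do not contribute to the loss:

```python
# Hypothetical helper: mask every non-assistant token in the labels.
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(segments):
    """Build (input_ids, labels) from (role, token_ids) pairs in chat order.

    Tokens from user or system messages get IGNORE_INDEX as their label,
    so only assistant tokens are used as training targets.
    """
    input_ids, labels = [], []
    for role, token_ids in segments:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


# Toy token ids, chosen only for illustration.
example = [("user", [11, 12, 13]), ("assistant", [21, 22])]
input_ids, labels = build_labels(example)
```

The user tokens stay in `input_ids`, so the model still attends to them as context; they are only left out of the loss.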