Dundalia opened 1 year ago
It depends on your specific use case. I noticed that it is better to mask when your dataset is long chat dialogues. When not masked, model responses become quite repetitive (repeating sentences from the context).
Intuitively, if your task is QA, it makes sense to mask out the context and question so that the model focuses on learning how to answer.
That's how I did it over at my qa repo
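To make the idea concrete, here is a minimal sketch of what "masking the prompt" usually means in practice: the prompt tokens are kept in `input_ids` so the model still sees them as context, but their positions in `labels` are set to `-100` (the default `ignore_index` of `torch.nn.CrossEntropyLoss`), so only the response tokens contribute to the loss. The token ids and helper name below are made up for illustration; this is not the repo's actual code.

```python
# Sketch of prompt masking for instruction tuning (illustrative only).
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def build_labels(prompt_ids, response_ids, mask_inputs=True):
    """Concatenate prompt + response; optionally mask prompt positions.

    The model still attends to the prompt (it stays in input_ids),
    but masked label positions produce no gradient.
    """
    input_ids = prompt_ids + response_ids
    if mask_inputs:
        labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    else:
        labels = list(input_ids)
    return input_ids, labels

# Hypothetical token ids: a 3-token prompt followed by a 3-token response.
input_ids, labels = build_labels([101, 7592, 102], [2023, 2003, 102])
print(labels)  # [-100, -100, -100, 2023, 2003, 102]
```

With `mask_inputs=False`, the model is also trained to reproduce the prompt itself, which is where the repetition of context sentences mentioned above can come from.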
I am facing a similar dilemma. If I want to fine-tune a model for summarization and I mask the chat I want to summarize, doesn't that hinder the model's ability to learn from the input, or to capture nuanced relationships between the summary and the input chat?
I am not understanding the conceptual usefulness of masking out the prompt. I have seen that there is a comment in `scripts/prepare_alpaca.py` that says:

```python
mask_inputs: bool = False,  # as in alpaca-lora
```
Is it recommended when fine-tuning with LoRA? Are there other benefits?
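One way to see the effect of the flag is to look at which positions actually contribute to the loss. The snippet below is an illustration, not the repo's code; it assumes the standard convention where masked labels are set to `-100` and skipped by `torch.nn.CrossEntropyLoss` via its `ignore_index`. The label lists are invented for the example.

```python
# Illustration: which token positions are supervised under each setting.
IGNORE_INDEX = -100  # convention shared with torch.nn.CrossEntropyLoss

def positions_in_loss(labels):
    """Return indices of tokens that contribute to the training loss."""
    return [i for i, t in enumerate(labels) if t != IGNORE_INDEX]

# Hypothetical 6-token example: 3 prompt tokens, then 3 response tokens.
masked   = [-100, -100, -100, 42, 17, 99]  # mask_inputs=True
unmasked = [5, 8, 3, 42, 17, 99]           # mask_inputs=False

print(positions_in_loss(masked))    # [3, 4, 5]  -> only the response
print(positions_in_loss(unmasked))  # [0, 1, 2, 3, 4, 5]  -> prompt too
```

So the flag does not change what the model sees as input; it only changes whether the prompt tokens are also treated as prediction targets. This is independent of LoRA: LoRA controls which weights are updated, while input masking controls which tokens the loss is computed over.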