wiseyy opened this issue 9 months ago
Why do you use the Seq2Seq collator (DataCollatorForSeq2Seq) here and not DataCollatorForLanguageModeling? I think it is for computing the loss only on the answer, given the instruction and the input, and the behaviour should be the same (at least in principle) if you instead adjust the masking so that only the tokens after "### Response: " are scored. Can you please confirm this?
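For concreteness, here is a minimal sketch of what I mean (assuming a Hugging Face tokenizer; the checkpoint name, prompt, and response strings are placeholders, and in practice the restriction is usually applied through the labels rather than the attention mask):

```python
from transformers import AutoTokenizer

# Placeholder checkpoint; any LLaMA-style tokenizer would do for illustration.
tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")

prompt = "### Instruction:\nClassify the material.\n\n### Response: "
response = "superconductor"

# Tokenize the prompt alone to find where the response starts.
# (The boundary is approximate, since tokenization of prompt + response
# may not split exactly at the same position.)
prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
full_ids = tokenizer(prompt + response, add_special_tokens=False)["input_ids"]

# Labels mirror input_ids, but every prompt position is set to -100 so the
# cross-entropy loss ignores it; only the response tokens are scored.
labels = [-100] * len(prompt_ids) + full_ids[len(prompt_ids):]

example = {
    "input_ids": full_ids,
    "attention_mask": [1] * len(full_ids),  # attention still covers the prompt
    "labels": labels,
}
```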
Also, do you use the prompts given in uniform_finetune.py when finetuning on matsci-nlp.py as well?
Thank you for your help!
1. Please refer to https://github.com/tloen/alpaca-lora/issues/412
2. Yes, and it is finetuned under a low-resource setting.
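To illustrate the point made in that issue, here is a minimal sketch of the practical difference between the two collators (the tokenizer checkpoint is a placeholder):

```python
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    DataCollatorForSeq2Seq,
)

tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")  # placeholder
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers define no pad token

# With mlm=False this collator rebuilds "labels" as a copy of "input_ids",
# so any -100 prompt masking done during preprocessing is discarded and the
# loss would cover the instruction/input as well as the response.
lm_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

# This collator keeps the "labels" built during preprocessing and pads them
# with -100, so the loss stays restricted to the response tokens.
seq2seq_collator = DataCollatorForSeq2Seq(
    tokenizer, padding=True, label_pad_token_id=-100, return_tensors="pt"
)
```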
Hi, while finetuning Llama2 with the matsci-nlp data provided in the repo, I faced the following issues.
I finetuned the model using the code given in uniform_finetune.py.
Can you please clarify the above points and let us know whether you performed any additional post-processing before evaluating Honeybee's output on the matsci-nlp dataset?