Closed mhajiaghayi closed 1 year ago
Sorry for the late reply, I forgot to answer this question when I first saw it. For the first question, we don't just generate an extra token for "Yes" and "No"; we extract the logits of the "Yes" and "No" tokens and use them to calculate the AUC. For the second question, there is a hyperparameter you can control: `train_on_inputs`. If you set it to True, the model is trained on all given inputs; otherwise, it only learns to predict the output. At present, there is no consensus in the community on which of these two approaches is more suitable for instruction tuning, so you may want to experiment on your own dataset. For our part, we follow the original setting in alpaca_lora and set `train_on_inputs=True`.
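To make the first point concrete, here is a minimal, self-contained sketch of the scoring idea: take the next-token logits, restrict a softmax to the two candidate tokens, and compute AUC over the resulting scores. The toy logits and the hand-rolled AUC are stand-ins, not the repo's actual code (in practice the two logits would come from something like `model(**inputs).logits[0, -1][[yes_id, no_id]]`).

```python
import math

def yes_score(yes_logit, no_logit):
    """Softmax restricted to the two candidate tokens; returns P('Yes')."""
    m = max(yes_logit, no_logit)  # subtract max for numerical stability
    ey = math.exp(yes_logit - m)
    en = math.exp(no_logit - m)
    return ey / (ey + en)

def auc(labels, scores):
    """AUC = fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy (label, (yes_logit, no_logit)) pairs standing in for real model outputs.
examples = [(1, (2.0, -1.0)), (1, (0.5, 0.0)), (0, (-1.0, 1.5)), (0, (0.2, 0.1))]
labels = [l for l, _ in examples]
scores = [yes_score(y, n) for _, (y, n) in examples]
print(round(auc(labels, scores), 3))  # → 1.0 on this toy data
```

Because the score is a monotonic function of the logit difference, ranking by `yes_score` is equivalent to ranking by `yes_logit - no_logit` for AUC purposes.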
Hi there, interesting work indeed. I'm wondering where in the evaluation code you set the generate() method to only generate one extra token for "Yes" and "No"?
Moreover, during training, when you pass the input and output, is the model trained on all given tokens, or does it only learn to predict the 'result' token?
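The `train_on_inputs` behavior described in the reply above usually comes down to label masking. Here is a hedged, self-contained sketch of that convention (`build_labels` and the token IDs are hypothetical; `-100` is the standard ignore_index for PyTorch cross-entropy, so masked positions contribute no loss):

```python
# Hypothetical sketch of the label masking behind train_on_inputs.
IGNORE_INDEX = -100  # PyTorch CrossEntropyLoss ignores targets with this value

def build_labels(prompt_ids, response_ids, train_on_inputs):
    """Return (input_ids, labels) for causal-LM fine-tuning."""
    input_ids = prompt_ids + response_ids
    if train_on_inputs:
        # Loss is computed on every token, prompt included.
        labels = list(input_ids)
    else:
        # Prompt tokens are masked out; loss falls only on the response.
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

ids, labels = build_labels([1, 2, 3], [4, 5], train_on_inputs=False)
print(labels)  # → [-100, -100, -100, 4, 5]
```

With `train_on_inputs=True` the same call would return labels identical to `input_ids`, which matches the alpaca_lora default mentioned above.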