Closed apapoudakis closed 7 months ago
Hi @apapoudakis, thank you for your interest in our work!

If you have `input_prefix_column` set, and you also use the flags `--max_prefix_length 64` and `--pad_prefix=True`, the prefix column will also be automatically tokenized and added to the tokenized input. But please verify it, for example, by decoding the input right before the call to `train(..)` or `evaluate(..)`.
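The suggested sanity check can be sketched roughly as follows. This is a toy stand-in, not the repository's code: the whitespace tokenizer below replaces the real model tokenizer, and the prefix/document strings are made up.

```python
# Toy sketch: verify that the prefix was prepended to the tokenized input
# by decoding it back to text right before calling train(..)/evaluate(..).
# The whitespace "tokenizer" is a stand-in for the real model tokenizer.

def tokenize(text):
    return text.split()

def decode(tokens):
    return " ".join(tokens)

prefix = "What is the main finding?"        # value from input_prefix_column
document = "The study shows improved recall on long inputs."

max_prefix_length = 64                      # mirrors --max_prefix_length 64
prefix_ids = tokenize(prefix)[:max_prefix_length]
input_ids = prefix_ids + tokenize(document)

# Decoding just before training should show the prefix at the front:
decoded = decode(input_ids)
print(decoded)
```

With a real Hugging Face tokenizer, the same check is `tokenizer.decode(input_ids)` on one example from the tokenized dataset.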
Please let us know if you have any more questions! Cheers, Uri
Thank you for your response!
By decoding the input, I think that `input_prefix_column` adds the prefix only at the start of the input (i.e., only to the first chunk). Probably for QA tasks the prefix should be added as in the SLED code, where the `prepend_prefix` argument is used for that.
Also, I would like to ask whether you have tried to compute eval_loss during training, as I'm facing memory issues when I try to add the labels in evaluation. Did you have any similar problems?
I think that `input_prefix_column` adds the prefix only at the start of the input (only the first chunk).
Right, that's probably correct.
Probably for QA tasks the prefix should be added as in the SLED code, where the `prepend_prefix` argument is used for that.
That's correct, we haven't tried that, but I agree that it may lead to even higher gains.
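For illustration, SLED-style prepending could look roughly like this. This is a minimal sketch under stated assumptions, not code from either repository; `chunk_with_prefix` and its arguments are hypothetical names, and token IDs are plain ints.

```python
# Minimal sketch: SLED-style chunking where the same prefix (e.g. the
# question in a QA task) is prepended to every chunk, rather than only
# to the first one.

def chunk_with_prefix(prefix_ids, input_ids, chunk_size):
    """Split input_ids into chunks and prepend prefix_ids to each chunk."""
    body_size = chunk_size - len(prefix_ids)  # room left for document tokens
    assert body_size > 0, "prefix longer than chunk size"
    chunks = []
    for start in range(0, len(input_ids), body_size):
        chunks.append(prefix_ids + input_ids[start:start + body_size])
    return chunks

# Prefix tokens [101, 102] appear at the front of every chunk:
chunks = chunk_with_prefix([101, 102], list(range(10)), chunk_size=6)
for chunk in chunks:
    print(chunk)
```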
Also, I would like to ask if you have tried to compute eval_loss during training
We haven't computed the eval_loss, because we used `predict_with_generate` (https://github.com/abertsch72/unlimiformer/blob/main/src/run.py#L786C58-L786C79).
So we only looked at the validation set's ROUGE and BERTScore during development, hyperparameter tuning, etc.
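As a toy illustration of validating with generation metrics instead of eval_loss, the sketch below computes a simplified ROUGE-1 F1 between a generated text and a reference. Real experiments use dedicated packages (e.g. `rouge_score`, `bert_score`); the strings here are made up.

```python
# Simplified ROUGE-1 F1: unigram-overlap F-measure between a generated
# prediction and a reference, as a stand-in for full ROUGE/BERTScore.
from collections import Counter

def rouge1_f1(prediction, reference):
    pred_counts = Counter(prediction.split())
    ref_counts = Counter(reference.split())
    overlap = sum((pred_counts & ref_counts).values())  # clipped matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the model answers the question",
                  "the model answers questions")
print(round(score, 3))
```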
Let us know if you have any questions! Uri
Hello and thank you for this great work!
1) Is it possible to add the same prefix in front of every chunk? For instance, as you mention in #20, for a QA task we want to add the question before every chunk. Do we need to make any other changes to this codebase, or just use the `input_prefix_column` argument?
2) Have you tried also using models which can process inputs longer than 4k?