huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Finetuning BART on a multi-input sequence to sequence task #13661

Closed: nbravulapalli closed this issue 3 years ago

nbravulapalli commented 3 years ago

I fine-tuned bart-base on a sequence-to-sequence task and I have the following questions:

a) I currently structure the input and output for the BART model in "t5-style" by adding prefixes in front of each piece of input. For BART, how should I give the model multiple inputs (or train it to return multiple outputs)? Is there a special token to separate inputs, should I keep using t5-style prefixes, etc.? Also, how would I do this for GPT-2/GPT-Neo? The sketch below shows the two options I am weighing.
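For context, here is a minimal sketch of the two input layouts I am considering; the `question`/`context` fields are just placeholders for my actual data, and I am not sure which layout BART expects:

```python
from transformers import BartTokenizer

# Placeholder fields; the real inputs depend on the task.
question = "Where is the Eiffel Tower?"
context = "The Eiffel Tower is in Paris."

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")

# Option 1: keep t5-style prefixes and concatenate into one source string.
source_with_prefixes = f"question: {question} context: {context}"

# Option 2: separate the pieces with the tokenizer's sep token (</s> for BART).
source_with_sep = f"{question} {tokenizer.sep_token} {context}"

# Either way, the combined string is tokenized as a single input sequence.
batch = tokenizer(source_with_sep, truncation=True, max_length=512, return_tensors="pt")
print(batch["input_ids"])
```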

b) When fine-tuned with prefixes, the target data is formatted as "output: ......"; however, the fine-tuned BART returns "outputoutput: ......". Why is this repetition occurring? Also, does the BART tokenizer automatically add the EOS token? I checked this with the small snippet below.
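This is the quick check I ran to see which special tokens the tokenizer adds by default (assuming `facebook/bart-base`):

```python
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")

ids = tokenizer("output: some target text")["input_ids"]
print(ids[0] == tokenizer.bos_token_id)   # BART's tokenizer prepends <s> ...
print(ids[-1] == tokenizer.eos_token_id)  # ... and appends </s> when add_special_tokens=True (the default)
print(tokenizer.decode(ids))
```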

c) Also, does the Trainer API automatically handle adjust_logits_during_generation and decoder_start_token_id as discussed in this post?
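For reference, this is how I am currently inspecting those settings on the model config (assuming `facebook/bart-base`); I am unsure whether the Trainer adjusts them itself or simply reads whatever the pretrained config provides:

```python
from transformers import BartForConditionalGeneration

model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Generation-related settings that ship with the pretrained config.
print(model.config.decoder_start_token_id)            # token the decoder starts from
print(model.config.bos_token_id, model.config.eos_token_id)

# If a different start token were needed, it could be set explicitly, e.g.:
# model.config.decoder_start_token_id = model.config.eos_token_id
```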

Could @patil-suraj or @patrickvonplaten help with this? This is my first project training an NLP model, and I would really appreciate any information you can offer regarding my questions.

patil-suraj commented 3 years ago

Hi there! It would be better if you posted this on the forum instead, since this is a fairly general question rather than an issue. You can tag me on the forum using @valhalla :)

Use issues to report bugs or for feature requests. Thanks!

nbravulapalli commented 3 years ago

Thanks for your reply! I will close this issue and repost it on the forum.

EDIT:

@patil-suraj I have posted this on the Hugging Face forum here. Can you please take a look at it? Thank you!