I am facing the same issue, and I noticed that the method indeed returns the `["input_ids"]` of `tgt_texts` as `labels`. I think I could easily fix this to return both `input_ids` and `attention_mask` of `tgt_texts` (as `decoder_...`), but I noticed the same pattern in other seq2seq models, like T5. I am not sure what the proper solution is, but if it is similar to what I suggest, then I'd be happy to make a pull request.
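For reference, here is a minimal sketch of what I'm seeing, assuming a transformers version from around the time of this issue (the checkpoint name and example texts are placeholders):

```python
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")

batch = tokenizer.prepare_seq2seq_batch(
    src_texts=["A source sentence."],
    tgt_texts=["A target sentence."],
    return_tensors="pt",
)

# Only three keys come back: the target side is exposed as `labels`,
# and no decoder_input_ids or decoder_attention_mask is returned.
print(batch.keys())  # input_ids, attention_mask, labels
```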
@LysandreJik I'd be happy to hear an opinion and start working on this.
I think https://github.com/huggingface/transformers/pull/6654/ and https://github.com/huggingface/transformers/issues/6624 are related: the PR changed `decoder_input_ids` to `labels`. Probably the documentation should be changed, but I have to get more familiar with the respective issue and PR to be sure.
Thanks for the feedback @freespirit. Hopefully the documentation will be updated, as it is a little bit confusing. But what I found is that the `modeling_bart.py` file already handles the problem: `_prepare_bart_decoder_inputs()` and `shift_tokens_right()` take care of it, if I am not wrong. But I think I have to dig deeper to understand this, which I am trying to do.
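For anyone following along, here is a sketch of what `shift_tokens_right` does, as far as I can tell, mirroring the logic in `modeling_bart.py` at the time (the exact code may differ between versions, and the token ids below are purely illustrative):

```python
import torch

def shift_tokens_right(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    # Shift the label ids one position to the right and move the last
    # non-pad token (usually </s>) to the front, so the decoder sees the
    # previous target token at each step (teacher forcing).
    prev_output_tokens = input_ids.clone()
    index_of_eos = (input_ids.ne(pad_token_id).sum(dim=1) - 1).unsqueeze(-1)
    prev_output_tokens[:, 0] = input_ids.gather(1, index_of_eos).squeeze()
    prev_output_tokens[:, 1:] = input_ids[:, :-1]
    return prev_output_tokens

# Illustrative ids only: <s>=0, pad=1, </s>=2 in the BART vocabulary.
labels = torch.tensor([[0, 8774, 2264, 2]])
decoder_input_ids = shift_tokens_right(labels, pad_token_id=1)
print(decoder_input_ids)  # tensor([[2, 0, 8774, 2264]])
```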
Pinging @sshleifer for advice
@MojammelHossain is correct, the docs are wrong. The correct usage is to allow `_prepare_bart_decoder_inputs` to make `decoder_input_ids` and `decoder_attention_mask` for you. For training, you only need to pass the 3 keys returned by `prepare_seq2seq_batch`.
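In other words, a training step can look like the following. This is a minimal sketch: the checkpoint name and texts are placeholders, and on older versions the model returns a tuple rather than a ModelOutput, so the loss is taken by index.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

batch = tokenizer.prepare_seq2seq_batch(
    src_texts=["A source sentence."],
    tgt_texts=["A target sentence."],
    return_tensors="pt",
)

# Pass only input_ids, attention_mask, and labels; the model derives
# decoder_input_ids (and the decoder attention mask) from `labels` internally.
outputs = model(**batch)
loss = outputs[0]  # the loss is the first element when `labels` is passed
loss.backward()
```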
I am trying to train a seq2seq model using BartModel. As per the BartTokenizer documentation, if I pass `tgt_texts` then it should return `decoder_attention_mask` and `decoder_input_ids` (please check the attachment for clarity). But I am only getting `input_ids`, `attention_mask`, and `labels`.