Closed jungwhank closed 4 years ago
Hi @jungwhank
For Bert2Bert, the pad_token is used as the decoder_start_token_id, and the input_ids and labels begin with cls_token_id ([CLS] for BERT) and end with sep_token_id ([SEP] for BERT).
For training, all you need to do is:
input_text = "some input text"
target_text = "some target text"
# add_special_tokens=True prepends [CLS] and appends [SEP] automatically
input_ids = tokenizer(input_text, add_special_tokens=True, return_tensors="pt")["input_ids"]
target_ids = tokenizer(target_text, add_special_tokens=True, return_tensors="pt")["input_ids"]
# passing labels makes the forward pass return the LM loss
model(input_ids=input_ids, decoder_input_ids=target_ids, labels=target_ids)
The EncoderDecoderModel class takes care of adding the pad_token to the decoder_input_ids.
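As a toy sketch of what this means in plain Python (this is not the library's internal code; the ids 101, 102, and 0 are BERT-style example values for [CLS], [SEP], and [PAD]): the decoder inputs are the labels shifted one position to the right, with the decoder_start_token_id prepended.

```python
# Toy illustration of how seq2seq training derives decoder inputs from labels:
# shift the labels one position to the right and prepend decoder_start_token_id
# (the pad token for bert2bert, as described above).
def shift_tokens_right(labels, decoder_start_token_id):
    """Return decoder input ids: [start] + labels[:-1]."""
    return [decoder_start_token_id] + labels[:-1]

# Hypothetical ids: 101 = [CLS], 102 = [SEP], 0 = [PAD].
labels = [101, 7, 8, 9, 102]  # [CLS] A B C [SEP]
decoder_input_ids = shift_tokens_right(labels, decoder_start_token_id=0)
print(decoder_input_ids)  # [0, 101, 7, 8, 9]
```

At every position the decoder then sees the previous target token as input and is trained to predict the current one.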
For inference:
model.generate(input_ids, decoder_start_token_id=model.config.decoder.pad_token_id)
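The role that decoder_start_token_id plays during generation can be sketched with a toy greedy loop (illustrative only; step_fn is a hypothetical stand-in for the real model forward pass): the decoder is seeded with the start id and one predicted token is appended per step.

```python
# Toy greedy decoding loop: generation seeds the decoder with
# decoder_start_token_id and appends one predicted token per step
# until the end-of-sequence id is produced.
def toy_generate(step_fn, decoder_start_token_id, eos_token_id, max_len=10):
    decoder_input_ids = [decoder_start_token_id]
    for _ in range(max_len):
        next_id = step_fn(decoder_input_ids)  # stand-in for a model forward pass
        decoder_input_ids.append(next_id)
        if next_id == eos_token_id:
            break
    return decoder_input_ids

# A dummy "model" that emits 5, 6, then EOS (id 2).
outputs = iter([5, 6, 2])
result = toy_generate(lambda ids: next(outputs),
                      decoder_start_token_id=0, eos_token_id=2)
print(result)  # [0, 5, 6, 2]
```

This is why the start token must be set correctly: every generated sequence begins with it, and the model was trained expecting it in that position.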
Hope this clarifies your question. Also pinging @patrickvonplaten for more info.
Hi @patil-suraj, thanks for answering.
Is it the same for BartForConditionalGeneration? Actually, I want to do a kind of translation task; are the decoder_input_ids and labels the same?
@patil-suraj's answer is correct! For the EncoderDecoder framework, one should set model.config.decoder_start_token_id to the BOS token (which does not exist in BERT's case, so we simply use the CLS token).
Bart is a bit different: for inference you can simply call model.generate(input_ids). input_ids always refers to the encoder input tokens for Seq2Seq models, and it is up to you whether to add special tokens or not; this is not done automatically in the generate function. For training you pass both input_ids and decoder_input_ids, and in this case the decoder_input_ids should start with Bart's decoder_start_token_id, model.config.decoder_start_token_id:
model(input_ids, decoder_input_ids=decoder_input_ids)
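To make the pairing concrete, here is a toy sketch of how decoder_input_ids and labels line up for Bart-style training (example token ids only, not Bart's real vocabulary; treating the EOS id as the decoder start id is an assumption about Bart's defaults): the decoder inputs start with decoder_start_token_id, and the labels are the unshifted target so they end with </s>.

```python
# Toy alignment of Bart-style training tensors (illustrative ids only):
#   decoder_input_ids: [start, <s>, A,  B,  C ]
#   labels:            [<s>,   A,   B,  C,  </s>]
start_id, bos_id, eos_id = 2, 0, 2  # assumption: </s> doubles as decoder start

target = [bos_id, 10, 11, 12, eos_id]        # <s> A B C </s>
decoder_input_ids = [start_id] + target[:-1]  # shift right, drop final </s>
labels = target                               # predict the unshifted target

print(decoder_input_ids)  # [2, 0, 10, 11, 12]
print(labels)             # [0, 10, 11, 12, 2]
```

Note that when only labels are supplied, the library can derive the shifted decoder inputs itself; passing decoder_input_ids explicitly, as in the call above, just makes the alignment visible.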
@patrickvonplaten thanks for answering!
But I have a question: is there a decoder_start_token_id in BartConfig? Should I just make my decoder_input_ids start with Bart's model.config.bos_token_id, or set model.config.decoder_start_token_id = token_id?
I think I solved the problem. Thanks
@jungwhank Great! Consider joining the awesome HF forum, if you haven't already :) It's the best place to ask such questions. The whole community is there to help you, and your questions will also help the community.
❓ Questions & Help
Details
Hello, I'm trying to use seq2seq models (such as Bart and EncoderDecoderModel (bert2bert)) and I'm a little confused about input_ids, decoder_input_ids, and tgt in the model inputs.
As I understand it, in a seq2seq model the decoder input should have a special token (<s> or something) before the sentence and the target should have a special token (</s> or something) after the sentence. For example: decoder_input = <s> A B C D E, target = A B C D E </s>.
So my questions are:
Should I put these special tokens in decoder_input_ids and tgt_ids when using a seq2seq model in this library? Or can I just pass decoder_input_ids and tgt_ids without any special token ids?
Also, should I put a </s> token after the target ids?
For example, should I set add_special_tokens=True for the encoder input_ids and put <s> or </s> in myself, i.e. input = a b c d e, decoder_input = <s> A B C D E, target = A B C D E </s>?