I have code written before 0.9 that uses Transformer.decoder.decode_seq for teacher forcing to fine-tune BERT. After upgrading to 0.9, decoder.decode_seq() seems to have been replaced by decoder(). Am I wrong?
However, after upgrading from version 0.8.3 to 0.9 and replacing decode_seq() with decoder(), the fine-tuning never converges.
If I switch back to 0.8.3 with decode_seq(), the code works perfectly: the loss decreases steadily, and the text generated by beam search is also correct. Any idea?