ddkang / loss_dropper

Apache License 2.0
Is there any way to apply this work to a pretrained model (e.g. BART, T5)? #4

Open ElderWanng opened 3 years ago

ElderWanng commented 3 years ago

I'm really interested in your great work. Just curious: is it possible to combine BART with loss truncation? I ask because the vanilla LSTM with attention is somewhat out of date.

ddkang commented 3 years ago

Hi @ElderWanng, you can take a pretrained model and continue training it with loss truncation. We find it works quite well.
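
To make the idea concrete, here is a minimal sketch of the truncation step in plain Python. It is not this repository's API: `truncation_mask`, `truncated_mean_loss`, and `drop_fraction` are hypothetical names, and the loss values are made up. It assumes you already have one loss value per example in a batch (for a pretrained model such as BART, that would be the unreduced cross-entropy averaged over each sequence), and it picks the cutoff within the current batch only, which the actual implementation may handle differently.

```python
def truncation_mask(losses, drop_fraction):
    """Return a 0/1 mask that zeroes out the highest-loss examples in a batch.

    losses: list of per-example loss values (floats).
    drop_fraction: fraction of the batch to drop, in [0, 1). Hypothetical name.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")
    n_drop = int(len(losses) * drop_fraction)
    # Indices sorted by loss, highest first; the first n_drop are dropped.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    dropped = set(order[:n_drop])
    return [0.0 if i in dropped else 1.0 for i in range(len(losses))]


def truncated_mean_loss(losses, drop_fraction):
    """Average the per-example losses over the examples that were kept."""
    mask = truncation_mask(losses, drop_fraction)
    kept = sum(mask)  # always >= 1 because drop_fraction < 1
    return sum(loss * m for loss, m in zip(losses, mask)) / kept


# Made-up per-example losses; the third example is an outlier.
batch_losses = [0.9, 1.1, 7.5, 1.0, 0.8]
print(truncation_mask(batch_losses, 0.2))
print(truncated_mean_loss(batch_losses, 0.2))
```

In a real fine-tuning loop, the masked mean would replace the usual batch loss before the backward pass. Averaging over the kept examples only is one possible normalization; averaging over the full batch (with dropped examples contributing zero) is another.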