Closed Niyx52094 closed 2 years ago
Try changing the code at L446 of code/OpenNMT/onmt/trainer.py as follows:
if hasattr(self.model.decoder, 'state') and self.model.decoder.state is not None:
    self.model.decoder.detach_state()
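The fix above can be illustrated outside of OpenNMT with a minimal, self-contained sketch. The `DummyDecoder` class and `maybe_detach` helper below are hypothetical stand-ins, not part of OpenNMT; they only show the guarded-call pattern.

```python
class DummyDecoder:
    """Hypothetical stand-in for a decoder that may or may not track state."""

    def __init__(self, state=None):
        self.state = state
        self.detached = False

    def detach_state(self):
        # In OpenNMT this would detach cached tensors from the autograd
        # graph; here we only record that the call happened.
        self.detached = True


def maybe_detach(decoder):
    # Guard against decoders (e.g. BART's) that either never define a
    # `state` attribute or leave it set to None.
    if hasattr(decoder, 'state') and decoder.state is not None:
        decoder.detach_state()


no_state = DummyDecoder(state=None)
maybe_detach(no_state)                    # skipped: state is None

with_state = DummyDecoder(state={'cache': []})
maybe_detach(with_state)                  # runs: detach_state() is called
```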
Thanks very much for your reply! I'm also curious why the BART model uses the Sequential generator by default instead of the Copy Generator. Does this mean no copy mechanism is used when I use the BART model? And may I know the reason for that? Thank you!
You're right, BART has no copy mechanism natively, and I didn't add one in training. I actually tried adding a copy loss during training, but it didn't help (some attention heads already act similarly to the copy mechanism). So here I just keep BART as is.
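For readers unfamiliar with what a copy mechanism adds, here is a minimal sketch of the standard idea: the final word distribution mixes the vocabulary softmax with attention over source tokens, weighted by a generation probability. All names (`copy_mixture`, `p_gen`, `src_to_vocab`) are illustrative, not OpenNMT's actual API.

```python
def copy_mixture(vocab_probs, copy_attn, src_to_vocab, p_gen):
    """Mix p(w) = p_gen * p_vocab(w) + (1 - p_gen) * attn mass on w.

    vocab_probs:  distribution over the vocabulary from the generator
    copy_attn:    attention distribution over source positions
    src_to_vocab: vocab id of the token at each source position
    p_gen:        probability of generating (vs. copying)
    """
    mixed = [p_gen * p for p in vocab_probs]
    for pos, attn in enumerate(copy_attn):
        # Route attention mass on each source position to its vocab id.
        mixed[src_to_vocab[pos]] += (1.0 - p_gen) * attn
    return mixed


vocab_probs = [0.7, 0.2, 0.1]   # distribution over a 3-word vocabulary
copy_attn = [0.9, 0.1]          # attention over 2 source tokens
src_to_vocab = [2, 1]           # source token 0 is vocab id 2, token 1 is id 1
out = copy_mixture(vocab_probs, copy_attn, src_to_vocab, p_gen=0.8)
# out is still a valid distribution (sums to 1.0)
```

The remark above is that in BART some attention heads already learn to behave like `copy_attn` here, which is one plausible reason an explicit copy loss gives no extra benefit.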
Hi, I'm trying to use the BART model in the model package. Is there any .yml file for BART? Currently I simply change
transformer-one2one-kp20k.yml
to BART, but a bug appears after I change the generator in BART to CopyGenerator
(the original generator in BART is Sequential, but I want to use copy_attn, so I changed the generator to CopyGenerator).
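For context, swapping in a CopyGenerator is normally driven from the training config rather than by editing the model object directly. The fragment below is a hypothetical sketch based on standard OpenNMT-py options; the exact keys depend on your OpenNMT-py version, so verify them against your installation before use.

```yaml
# Hypothetical config fragment (verify option names for your version):
copy_attn: true         # build a CopyGenerator instead of the plain softmax
reuse_copy_attn: true   # reuse the decoder attention as the copy attention
```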