Open sam-writer opened 4 years ago
I agree, using BART is more suitable for that when you set match_source_len=False.
Load the BART base model:
import torch
bart = torch.hub.load('pytorch/fairseq', 'bart.base')  # takes around two minutes
bart.eval()  # enable evaluation mode
bart.cuda()  # use GPU
Use it:
sentences = ['The <mask> is on the <mask> in front of <mask>.']
bart.fill_mask(sentences, topk=3, beam=10, match_source_len=False)
Gives the following results:
[[('', tensor(-1.5974e-05, device='cuda:0')),
('�The photo is on the right in front of the building.',
tensor(-0.6064, device='cuda:0')),
('�The photo is on the right in front of the house.',
tensor(-0.6113, device='cuda:0'))]]
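Note that the top-ranked hypothesis in this output is an empty string, and the other two start with a stray replacement character. A minimal sketch of post-processing such results follows. The helper name clean_fill_mask_results is hypothetical and not part of fairseq, and it assumes the results come back as (text, score) pairs like the ones shown above:

```python
from typing import Iterable, List, Tuple


def clean_fill_mask_results(
    hypotheses: Iterable[Tuple[str, object]],
) -> List[Tuple[str, float]]:
    """Drop empty hypotheses and convert scores to plain floats.

    Hypothetical helper (not a fairseq API): expects (text, score) pairs,
    where score is a Python number or a single-element tensor.
    """
    cleaned = []
    for text, score in hypotheses:
        # Remove the U+FFFD replacement character and surrounding whitespace.
        text = text.replace("\ufffd", "").strip()
        if not text:
            continue  # skip hypotheses that are empty after cleaning
        # float() accepts Python numbers and single-element torch tensors.
        cleaned.append((text, float(score)))
    # Highest score first.
    return sorted(cleaned, key=lambda pair: pair[1], reverse=True)


# Plain floats stand in here for the CUDA tensors in the output above.
example = [
    ("", -1.5974e-05),
    ("\ufffdThe photo is on the right in front of the building.", -0.6064),
    ("\ufffdThe photo is on the right in front of the house.", -0.6113),
]

for text, score in clean_fill_mask_results(example):
    print(f"{score:.4f}  {text}")
```

With real output, each inner list returned by fill_mask (one per input sentence) would be passed to the helper separately.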
This is definitely doable; I have a notebook that I can share with anyone interested (LINK). It is unclear, however, whether it is doable within a reasonable performance budget.
What I haven't tried yet, but would like to: use BART for this, which should be a natural fit because its pretraining procedure includes reconstructing text with masked-out spans.