yxuansu / NAG-BERT

[EACL'21] Non-Autoregressive Text Generation with Pre-trained Language Models
https://arxiv.org/abs/2102.08220
Apache License 2.0

Question regarding the speed up #2

Open allanj opened 3 years ago

allanj commented 3 years ago

I saw that the paper uses argmax as the equation to obtain the sequence. I understand that this would be the Viterbi algorithm, whose complexity is again O(n). I'm confused about how it is faster than the autoregressive approach.

clearloveclearlove commented 3 years ago

> I saw that the paper uses argmax as the equation to obtain the sequence. I understand that this would be the Viterbi algorithm, whose complexity is again O(n). I'm confused about how it is faster than the autoregressive approach.

I think the reason is that the model only runs its forward pass once and then applies Viterbi decoding, whereas an autoregressive model has to run the forward pass n times.
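
To make this concrete, here is a minimal sketch of the Viterbi step over scores that the single forward pass has already produced; the function and variable names are illustrative and not taken from the NAG-BERT code:

```python
import torch

def viterbi_decode(emissions, transitions):
    """Decode the best label sequence from precomputed scores.

    emissions:   (seq_len, num_labels) scores from a single forward pass
    transitions: (num_labels, num_labels) transition scores
    Pure dynamic programming in O(seq_len * num_labels^2); the model is never called again.
    """
    seq_len, num_labels = emissions.shape
    score = emissions[0]                      # best score ending in each label at position 0
    backpointers = []
    for t in range(1, seq_len):
        # score[i] + transitions[i, j] + emissions[t, j] for every (prev, curr) label pair
        total = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score, best_prev = total.max(dim=0)   # best previous label for each current label
        backpointers.append(best_prev)
    # Follow the backpointers from the best final label to recover the full path
    best_last = int(score.argmax())
    path = [best_last]
    for best_prev in reversed(backpointers):
        path.append(int(best_prev[path[-1]]))
    return list(reversed(path))
```

The decode only walks over scores that are already in memory, so no additional transformer passes are needed after the single forward computation.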

yxuansu commented 2 years ago

> I saw that the paper uses argmax as the equation to obtain the sequence. I understand that this would be the Viterbi algorithm, whose complexity is again O(n). I'm confused about how it is faster than the autoregressive approach.

Hello, thank you for your question. The speed-up comes from the fact that NAG-BERT only does the forward computation once, whereas autoregressive models have to do the forward pass n times, where n is the length of the output sequence.
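
To illustrate the difference, here is a rough sketch of the two decoding loops; the `model` interface is hypothetical and not the actual NAG-BERT API:

```python
import torch

@torch.no_grad()
def decode_non_autoregressive(model, src_ids, tgt_len):
    """One forward pass over all target positions, then a per-position argmax.
    `model` is a hypothetical module returning (tgt_len, vocab_size) logits."""
    logits = model(src_ids, tgt_len)          # single forward computation
    return logits.argmax(dim=-1)              # pick the best token at every position at once

@torch.no_grad()
def decode_autoregressive(model, src_ids, bos_id, tgt_len):
    """Greedy autoregressive decoding: one forward pass per generated token,
    so the model is invoked tgt_len times."""
    out = [bos_id]
    for _ in range(tgt_len):
        logits = model(src_ids, torch.tensor(out))   # re-run the model at every step
        out.append(int(logits[-1].argmax()))
    return out[1:]
```

In the non-autoregressive case the expensive transformer computation happens once, and the remaining work is only the (much cheaper) argmax or Viterbi decode over the resulting scores.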