yxuansu / NAG-BERT

[EACL'21] Non-Autoregressive Text Generation with Pre-trained Language Models
https://arxiv.org/abs/2102.08220
Apache License 2.0

Question regarding the speed up #2

Open allanj opened 3 years ago

allanj commented 3 years ago

I saw that the paper uses argmax as the equation to obtain the sequence. I understand that this would be the Viterbi algorithm, whose complexity is again O(n). I'm confused about how it is faster than the autoregressive approach.

clearloveclearlove commented 3 years ago

> I saw that the paper uses argmax as the equation to obtain the sequence. I understand that this would be the Viterbi algorithm, whose complexity is again O(n). I'm confused about how it is faster than the autoregressive approach.

I think the reason is that the model only runs its forward pass once and then applies Viterbi decoding, whereas an autoregressive model has to run the forward pass n times.
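
To make this concrete, here is a minimal sketch of the Viterbi step over scores that the single forward pass has already produced; the function and variable names are illustrative and not taken from the NAG-BERT code:

```python
import torch

def viterbi_decode(emissions, transitions):
    """Decode the best label sequence from precomputed scores.

    emissions:   (seq_len, num_labels) scores from a single forward pass
    transitions: (num_labels, num_labels) transition scores
    Pure dynamic programming in O(seq_len * num_labels^2); the model is never called again.
    """
    seq_len, num_labels = emissions.shape
    score = emissions[0]                      # best score ending in each label at position 0
    backpointers = []
    for t in range(1, seq_len):
        # score[i] + transitions[i, j] + emissions[t, j] for every (prev, curr) label pair
        total = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score, best_prev = total.max(dim=0)   # best previous label for each current label
        backpointers.append(best_prev)
    # Follow the backpointers from the best final label to recover the full path
    best_last = int(score.argmax())
    path = [best_last]
    for best_prev in reversed(backpointers):
        path.append(int(best_prev[path[-1]]))
    return list(reversed(path))
```

The decode only walks over scores that are already in memory, so no additional transformer passes are needed after the single forward computation.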

yxuansu commented 2 years ago

> I saw that the paper uses argmax as the equation to obtain the sequence. I understand that this would be the Viterbi algorithm, whose complexity is again O(n). I'm confused about how it is faster than the autoregressive approach.

Hello, thank you for your question. The speed-up comes from the fact that NAG-BERT only does the forward computation once, whereas autoregressive models have to do the forward pass n times, where n is the length of the output sequence.
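
To illustrate the difference, here is a rough sketch of the two decoding loops; the `model` interface is hypothetical and not the actual NAG-BERT API:

```python
import torch

@torch.no_grad()
def decode_non_autoregressive(model, src_ids, tgt_len):
    """One forward pass over all target positions, then a per-position argmax.
    `model` is a hypothetical module returning (tgt_len, vocab_size) logits."""
    logits = model(src_ids, tgt_len)          # single forward computation
    return logits.argmax(dim=-1)              # pick the best token at every position at once

@torch.no_grad()
def decode_autoregressive(model, src_ids, bos_id, tgt_len):
    """Greedy autoregressive decoding: one forward pass per generated token,
    so the model is invoked tgt_len times."""
    out = [bos_id]
    for _ in range(tgt_len):
        logits = model(src_ids, torch.tensor(out))   # re-run the model at every step
        out.append(int(logits[-1].argmax()))
    return out[1:]
```

In the non-autoregressive case the expensive transformer computation happens once, and the remaining work is only the (much cheaper) argmax or Viterbi decode over the resulting scores.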