ds-hwang closed 1 week ago
In MultiheadAttention.extend_step, logit_bias was hardcoded to have a length of 1 along the query (step) dimension. This PR modifies it to support multi-step inputs. The change also aligns extend_step more closely with forward, reducing the overall code complexity.
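
To illustrate what a multi-step logit bias looks like, here is a minimal sketch. It is not the code from this PR: the function name `multi_step_causal_bias`, its parameters, and the `NEG_INF` constant are hypothetical, and it uses NumPy for portability. It assumes the bias is additive, has shape `[num_steps, kv_len]`, and that query `i` sits at absolute position `time_step + i`.

```python
import numpy as np

# Assumption: a large negative value stands in for "masked out" logits.
NEG_INF = -1e9


def multi_step_causal_bias(time_step, num_steps, kv_len):
    """Build a [num_steps, kv_len] additive causal bias (hypothetical helper).

    Query i sits at absolute position time_step + i and may attend to
    key positions 0..time_step + i inclusive.
    """
    query_pos = time_step + np.arange(num_steps)[:, None]  # [num_steps, 1]
    key_pos = np.arange(kv_len)[None, :]                   # [1, kv_len]
    return np.where(key_pos <= query_pos, 0.0, NEG_INF)


# Single-step decoding is the special case num_steps=1.
single = multi_step_causal_bias(time_step=2, num_steps=1, kv_len=6)
# Multi-step decoding: three new positions starting at time_step=2.
bias = multi_step_causal_bias(time_step=2, num_steps=3, kv_len=6)
```

With `num_steps=1` this reduces to the single-row bias of the old behavior, which is why a multi-step formulation can share a code path with forward.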
Could you take a look? From 901
Thank you for the review!