Closed biendltb closed 5 months ago
The output of the attention layer should be assigned back to `x`. As written, `attn_out` is not used anywhere, so the attention result is silently discarded. This is a bug in the code.
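To make the problem concrete, here is a minimal sketch of the pattern, not the repository's actual code. The names `toy_attention`, `toy_mlp`, `block_buggy`, and `block_fixed` are hypothetical stand-ins, the "attention" is a placeholder transform, and any residual connection or normalization the real block may have is left out.

```python
# Hypothetical toy block; names and math are illustrative assumptions only.

def toy_attention(x, kv_cache):
    """Stand-in for an attention layer: returns an output and the updated cache."""
    new_cache = kv_cache + [list(x)]   # append the current activations to the cache
    out = [2.0 * v for v in x]         # placeholder transform, not real attention
    return out, new_cache

def toy_mlp(x):
    """Stand-in for the feed-forward layer."""
    return [v + 1.0 for v in x]

def block_buggy(x, kv_cache):
    # Bug: attn_out is computed but never used, so the MLP sees the raw input.
    attn_out, kv_cache = toy_attention(x, kv_cache)
    x = toy_mlp(x)
    return x, kv_cache

def block_fixed(x, kv_cache):
    # Fix: assign the attention output back to x before the next layer.
    x, kv_cache = toy_attention(x, kv_cache)
    x = toy_mlp(x)
    return x, kv_cache

buggy_out, buggy_cache = block_buggy([1.0, 2.0], [])
fixed_out, fixed_cache = block_fixed([1.0, 2.0], [])
```

In this sketch both variants update the cache identically, which may be why a slip like this is easy to miss: only the block's output differs.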
@biendltb Oh yes, this was introduced when adding the KV cache. Thanks for catching this!