@anklebreaker Hey! If you use the SelfAttention modules by themselves, I don't believe the projection matrices are redrawn, so they should always stay the same. (Correct me if I'm wrong on that.)
Hey, thanks for making this project!
Quick question: you have a fix_projection_matrices() method to make the model output deterministic. However, if we only use individual layers such as the SelfAttention module, FastAttention appears to create a new random projection matrix at initialization. Is there a similar way to make those matrices savable?
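To make the distinction in this thread concrete, here is a minimal, library-independent sketch. It assumes a toy layer that draws its random matrix once at construction; `ToyRandomFeatureLayer`, `state()` and `load_state()` are hypothetical names for illustration, not the project's API. In PyTorch, the analogous mechanism would be a tensor registered with `register_buffer`, which is included in `state_dict()` by default, but whether FastAttention stores its matrix that way should be checked against the source.

```python
import json
import random


class ToyRandomFeatureLayer:
    """Hypothetical stand-in for a layer that owns a random projection matrix."""

    def __init__(self, nb_features, dim, seed=None):
        rng = random.Random(seed)
        # Drawn once, at construction time; calling the layer never redraws it.
        self.projection = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(nb_features)
        ]

    def __call__(self, x):
        # Project the input vector onto each random feature row.
        return [sum(w * v for w, v in zip(row, x)) for row in self.projection]

    def state(self):
        # Serialize the matrix so it can be stored next to the other weights.
        return json.dumps(self.projection)

    def load_state(self, blob):
        # Replace the freshly drawn matrix with a previously saved one.
        self.projection = json.loads(blob)


x = [0.5, -1.0, 2.0]
layer = ToyRandomFeatureLayer(nb_features=4, dim=3)
saved = layer.state()

# Repeated calls on the same instance reuse the same matrix.
same_instance_stable = layer(x) == layer(x)

# A fresh instance draws a different matrix at initialization...
restored = ToyRandomFeatureLayer(nb_features=4, dim=3)
differs_before_load = restored(x) != layer(x)

# ...until the saved matrix is loaded into it.
restored.load_state(saved)
matches_after_load = restored(x) == layer(x)
```

Under these assumptions, both observations in the thread can hold at once: a new matrix appears at every initialization, yet outputs are reproducible as long as the matrix is saved and restored along with the rest of the layer's state.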