Open veya2ztn opened 1 year ago
Other than some formatting and refactoring issues, I love the fast-retention implementation! I was hoping to get into that. Thanks for your work!
Will this be merged?
There are some code styling issues and some things I don't understand fully. I think it's great to have its own branch for now.
1. Fix a bug in the mode that takes inputs_embedding rather than inputs_ids.
2. Add a fixed sequence length argument for the case where the input is (additional_token, past_kv).
3. Cache the fixed retnet_rel_pos, so it does not need to be regenerated at runtime.
4. Add a fast retention implementation for when the sequence length >> D**2. See https://github.com/veya2ztn/fast_retention
5.1 I set use_glu to default to false, for consistency with the old code.
5.2 The layer norm setting in the FFN seems wrong: self.embed_dim should be ffn_dim. Anyway, I rolled back to
self.ffn_layernorm = LayerNorm(ffn_dim, eps=layernorm_eps) if subln else None
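
To illustrate why the width matters here: if the sub-LayerNorm sits between the two FFN projections, it sees activations of width ffn_dim, not embed_dim. Below is a minimal, dependency-free sketch of that shape argument. It uses plain Python lists instead of torch tensors, and layer_norm, linear, ffn_forward and ln_width are hypothetical helper names for illustration only, not the code in this PR. PyTorch's nn.LayerNorm similarly expects the trailing dimension of its input to match normalized_shape.

```python
import math

def layer_norm(x, weight, bias, eps=1e-6):
    """Normalize one vector, then scale and shift (hypothetical helper)."""
    if len(x) != len(weight) or len(x) != len(bias):
        raise ValueError(
            f"LayerNorm built for width {len(weight)}, got input of width {len(x)}"
        )
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * w + b for v, w, b in zip(x, weight, bias)]

def linear(x, W):
    """Bias-free linear map; W has shape (out_dim, in_dim)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]

def ffn_forward(x, fc1, fc2, ln_width):
    """Assumed layout: fc1 -> activation -> sub-LayerNorm -> fc2."""
    h = linear(x, fc1)                      # embed_dim -> ffn_dim
    h = [max(0.0, v) for v in h]            # ReLU as a stand-in activation
    h = layer_norm(h, [1.0] * ln_width, [0.0] * ln_width)  # must match ffn_dim
    return linear(h, fc2)                   # ffn_dim -> embed_dim

embed_dim, ffn_dim = 4, 8
fc1 = [[0.1 * (i + j + 1) for j in range(embed_dim)] for i in range(ffn_dim)]
fc2 = [[0.05 * (i - j) for j in range(ffn_dim)] for i in range(embed_dim)]
x = [1.0, -2.0, 0.5, 3.0]

# Works: the norm is sized to the hidden width.
out = ffn_forward(x, fc1, fc2, ln_width=ffn_dim)

# Fails: a norm sized to embed_dim cannot be applied to ffn_dim activations.
try:
    ffn_forward(x, fc1, fc2, ln_width=embed_dim)
except ValueError as err:
    print("shape mismatch:", err)
```

With embed_dim == ffn_dim the two choices would coincide, which may be why a mismatch like this can go unnoticed in some configurations.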