Closed PhilippMarquardt closed 4 years ago
@PhilippMarquardt it is possible to use Linformer-type attention here: https://github.com/lucidrains/linear-attention-transformer/ which has reversibility built in!
@PhilippMarquardt I threw in reversibility in the most recent release, hope you find it helpful!
This is a feature request rather than an issue. Since the complexity of this attention is still quite high, it would be nice to have the option of making the network reversible, like some of your other implementations. That would let users choose a larger k or batch size. Otherwise, thanks for all your amazing work!
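For context, the memory saving being requested comes from reversible residual coupling (as used in RevNets and Reformer): the layer input can be recomputed from its output during the backward pass, so activations need not be stored. Below is a minimal, dependency-free sketch of that coupling, not the library's actual implementation; `f` and `g` are hypothetical placeholders standing in for the attention and feed-forward sub-blocks.

```python
def reversible_forward(x1, x2, f, g):
    # Coupled update: y1 = x1 + f(x2); y2 = x2 + g(y1)
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def reversible_inverse(y1, y2, f, g):
    # Exactly recover the inputs from the outputs, so the forward
    # activations never need to be kept in memory.
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

if __name__ == "__main__":
    f = lambda v: [2.0 * t for t in v]   # stand-in for an attention block
    g = lambda v: [t + 1.0 for t in v]   # stand-in for a feed-forward block
    x1, x2 = [1.0, 2.0], [3.0, 4.0]
    y1, y2 = reversible_forward(x1, x2, f, g)
    assert reversible_inverse(y1, y2, f, g) == (x1, x2)
    print("inputs recovered exactly")
```

Because each block is invertible this way, activation memory stays roughly constant in depth, which is what frees up room for a bigger k or batch size.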