Hi @bionicles this looks like a really interesting paper/concept. It seems like it could be a fit as a TFA Layer.
You mentioned you may be interested in contributing; what would that depend on?
Just time, and my ability to understand the paper/code... This one also looks good: https://arxiv.org/pdf/1907.09720v1.pdf I'm definitely interested in contributing to TFA! Maybe some simpler stuff we already have working would be better in the short term.
I'd like to try understanding the paper and implementing this layer, if no one else is working on this issue.
Any update on this?
Sorry, I was busy with some projects and could not finish the work on this. If you are looking to contribute, go ahead; it would be really helpful :D. If not, I might try to set aside some time and get back to implementing it.
Likewise, I can’t do this now, but it would be cool!
Hi, @bionicles @sayoojbk @Squadrick I want to take up this issue if it's okay. Thanks
Sure @gaurav-singh1998, you can move forward with this. If you need any help, ping any of us on Gitter.
Hello @sayoojbk, as I am new to this repository, I may take some time to get acquainted with the code base and finally come up with a PR. Is that okay?
Yeah, take your time! If you need any help, ping the official Gitter channel for SIG-Addons :P
TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision: TensorFlow Addons Wind Down
Please consider sending feature requests / contributions to other repositories in the TF community with a similar charter to TFA: Keras, Keras-CV, Keras-NLP.
System information
Describe the feature and the current behavior/state.
This paper introduces a structured memory which can be easily integrated into a neural network. The memory is very large by design and therefore significantly increases the capacity of the architecture, by up to a billion parameters with a negligible computational overhead. Its design and access pattern is based on product keys, which enable fast and exact nearest neighbor search. The ability to increase the number of parameters while keeping the same computational budget lets the overall system strike a better trade-off between prediction accuracy and computation efficiency both at training and test time. This memory layer allows us to tackle very large scale language modeling tasks. In our experiments we consider a dataset with up to 30 billion words, and we plug our memory layer in a state-of-the-art transformer-based architecture. In particular, we found that a memory augmented model with only 12 layers outperforms a baseline transformer model with 24 layers, while being twice faster at inference time. We release our code for reproducibility purposes.
https://arxiv.org/pdf/1907.05242v1.pdf
https://github.com/facebookresearch/XLM/blob/master/src/model/memory/memory.py

Will this change the current api? How?
Yes: a new layer that adds a large parameter memory to the model.

Who will benefit with this feature?
People who use the TFA + Keras API.

Any Other info.
i like pie
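For anyone picking this up, here is a minimal NumPy sketch of the core product-key lookup the paper describes: the query is split into two halves, each half is scored against a small sub-key set, and the exact top-k over the n² implicit keys is recovered from the k×k candidate sums. Function and variable names here are illustrative assumptions, not the API of the XLM reference implementation; a real TFA layer would do this batched in TensorFlow.

```python
import numpy as np

def product_key_lookup(query, subkeys1, subkeys2, values, k=4):
    """Toy sketch of a product-key memory lookup (Lample et al., 2019).

    query:    (d,) vector, split into two halves of size d/2
    subkeys1: (n, d/2) first sub-key set
    subkeys2: (n, d/2) second sub-key set
    values:   (n*n, v) memory value table; slot (i, j) lives at index i*n + j
    """
    d = query.shape[0]
    q1, q2 = query[: d // 2], query[d // 2:]
    n = subkeys1.shape[0]

    # Score each half against its sub-key set: O(n*d) work
    # instead of O(n^2 * d) for scoring all n^2 product keys.
    s1 = subkeys1 @ q1                            # (n,)
    s2 = subkeys2 @ q2                            # (n,)

    # Top-k per half, then exact top-k over the k*k candidate sums,
    # since the score of product key (i, j) is s1[i] + s2[j].
    i1 = np.argsort(-s1)[:k]
    i2 = np.argsort(-s2)[:k]
    cand = s1[i1][:, None] + s2[i2][None, :]      # (k, k)
    flat = np.argsort(-cand.ravel())[:k]
    rows, cols = np.unravel_index(flat, (k, k))
    idx = i1[rows] * n + i2[cols]                 # slots in the n*n memory

    # Softmax over the selected scores, then a sparse weighted
    # sum over only k memory values.
    w = np.exp(cand.ravel()[flat])
    w = w / w.sum()
    return w @ values[idx], idx
```

This is what gives the layer its capacity/compute trade-off: the value table grows as n², but each query touches only k slots.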