Saichethan closed this issue 3 years ago
You would need to create a data structure that keeps track of every unique word encountered in the text and all of its previous embeddings. This would live outside the Keras/TF model in this repository. You could either add that memory to bubs itself, or implement it in your own code on top of bubs.
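A minimal sketch of such a memory, assuming embeddings are plain vectors and that words are keyed case-insensitively (both assumptions mine, not part of bubs):

```python
from collections import defaultdict

# Hypothetical memory: for each unique word (lowercased, by assumption),
# keep every contextual embedding seen for it so far.
memory = defaultdict(list)

def record(word, embedding):
    """Store a new contextual embedding for `word`."""
    memory[word.lower()].append(embedding)

# Two different surface forms of the same word accumulate together.
record("bank", [0.1, 0.2])
record("Bank", [0.3, 0.6])
print(len(memory["bank"]))  # prints 2
```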
If you go the former route, we are accepting pull requests :-)
Please refer to the beginning of Section 2 of the paper http://alanakbik.github.io/papers/naacl2019_embeddings.pdf. The embed() function they mention is equivalent to the example in https://github.com/kensho-technologies/bubs/blob/master/README.md: it accepts text and outputs an embedding for each word. But the second part, the memory, is not implemented in bubs.
It requires an embed() function that produces a contextualized embedding for a given word in its sentence context (see Akbik et al. (2018)), a memory that records all previous contextual embeddings of each unique word, and a pool() operation that pools those embedding vectors.
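Putting the three pieces together, here is a rough, runnable sketch. The `embed()` below is a stand-in that fakes tiny 2-d vectors (a real one would come from bubs or another contextual embedder), `pool()` uses mean pooling (the paper also discusses min/max, and concatenates the pooled vector with the current one), and all names are illustrative, not bubs API:

```python
from collections import defaultdict

def embed(sentence):
    # Stand-in for a real contextual embedder: one vector per word.
    # Fakes 2-d vectors from position and word length so the sketch runs.
    return [[float(i), float(len(w))] for i, w in enumerate(sentence)]

memory = defaultdict(list)

def pool(vectors):
    """Mean-pool a list of equal-length vectors."""
    n = len(vectors)
    return [sum(dims) / n for dims in zip(*vectors)]

def pooled_embed(sentence):
    """For each word: embed it in context, add the embedding to the
    word's memory, and return the pooled memory for that word."""
    out = []
    for word, vec in zip(sentence, embed(sentence)):
        memory[word.lower()].append(vec)
        out.append(pool(memory[word.lower()]))
    return out

first = pooled_embed(["the", "bank", "flooded"])
second = pooled_embed(["the", "bank", "lent"])
```

After the second sentence, the embedding returned for "bank" reflects both occurrences, which is the point of the pooling: rare words gradually accumulate evidence across contexts.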
Hello, how can I get PooledContextualEmbeddings as described in http://alanakbik.github.io/papers/naacl2019_embeddings.pdf?