I have a masked LM pretrained with BERT.
The embeddings are poor at the sentence level, but do well for base tokens.
There is a natural tree structure to my corpus that I believe stands to gain from something like ON-LSTM (ordered-neuron LSTM).
Do you think swapping out the embedding layer of the ON-LSTM for pretrained BERT embeddings could be fruitful?
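
Concretely, the swap I'm imagining looks roughly like this PyTorch sketch (the class name, checkpoint, and hidden size are just placeholders, and a plain `nn.LSTM` stands in where an actual ON-LSTM cell would go):

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertOnLstm(nn.Module):
    """Rough sketch: frozen BERT token states feed a recurrent layer.

    nn.LSTM is only a stand-in here; the real experiment would drop in
    an ON-LSTM cell (e.g. from the original authors' implementation)
    so the ordered-neuron / tree-induction machinery is actually used.
    """

    def __init__(self, bert_name="bert-base-uncased", hidden_size=512):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        # Freeze BERT so only the recurrent layer trains on top of it.
        for p in self.bert.parameters():
            p.requires_grad = False
        # Placeholder for an ON-LSTM layer with the same input size.
        self.rnn = nn.LSTM(
            input_size=self.bert.config.hidden_size,
            hidden_size=hidden_size,
            batch_first=True,
        )

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():
            # Contextual token embeddings from the pretrained masked LM
            # replace the ON-LSTM's usual learned embedding lookup.
            token_embs = self.bert(
                input_ids=input_ids, attention_mask=attention_mask
            ).last_hidden_state
        out, _ = self.rnn(token_embs)
        return out

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertOnLstm()
batch = tokenizer(["the cat sat on the mat"], return_tensors="pt")
hidden = model(batch["input_ids"], batch["attention_mask"])
print(hidden.shape)  # (1, seq_len, 512)
```

The point being that BERT's contextual token states (good, per the above) replace the ON-LSTM's own embedding table, and only the recurrent layer trains, hopefully picking up the tree structure the sentence-level embeddings miss.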