Closed: hc09141 closed this issue 4 years ago
Hi, thanks for your interest. I'm glad you are finding the library useful. Would you like to use BERT embeddings to initialize your model, or would you like to extend beyond the vocabulary of your dataset? For the latter, you could model subwords, for example with BPE, instead of full words. For the former, you could forward the pre-trained BERT model on the encoder side to produce contextual embeddings.
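To make the subword route concrete, here is a minimal byte-pair-encoding sketch in pure Python. It is an illustration of the idea only, not fairseq's BPE implementation; the toy corpus and the number of merges are arbitrary:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across a corpus of symbol sequences."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with its concatenation."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: each word is a tuple of characters mapped to its frequency.
corpus = {
    ("l", "o", "w"): 5,
    ("l", "o", "w", "e", "r"): 2,
    ("n", "e", "w", "e", "s", "t"): 6,
    ("w", "i", "d", "e", "s", "t"): 3,
}
for _ in range(3):  # learn 3 merge operations
    pair = most_frequent_pair(corpus)
    corpus = merge_pair(corpus, pair)
```

After training, unseen words are segmented with the same learned merges, so the model's vocabulary is no longer limited to full words seen in the dataset.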
Thanks for your suggestions, I will look into using a pretrained BERT model to replace the embedding layer in FConvEncoder.
Hi, thanks for the amazing library. I think I have a similar issue.
@huihuifan Do you mean that we could modify, say, TransformerEncoder so that it takes the BERT encoder's outputs as its input embeddings?
Thank you!
Yes, or you could try incorporating the BERT embeddings as an additional input alongside the embeddings you already have.
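One way to sketch the "additional input" idea in PyTorch: concatenate the learned token embeddings with precomputed, frozen contextual features (e.g. from a BERT forward pass) and project back to the model dimension. The class name, `bert_dim`, and the linear projection here are illustrative assumptions, not fairseq or BERT API:

```python
import torch
import torch.nn as nn

class AugmentedEmbedding(nn.Module):
    """Learned token embeddings concatenated with frozen external
    features (e.g. precomputed BERT outputs), projected to embed_dim.
    A sketch of the idea, not fairseq's actual embedding interface."""

    def __init__(self, vocab_size, embed_dim, bert_dim):
        super().__init__()
        self.tokens = nn.Embedding(vocab_size, embed_dim)
        self.proj = nn.Linear(embed_dim + bert_dim, embed_dim)

    def forward(self, token_ids, bert_features):
        # bert_features: (batch, seq_len, bert_dim), precomputed and
        # kept frozen so only the small embedding/projection is trained.
        x = torch.cat([self.tokens(token_ids), bert_features], dim=-1)
        return self.proj(x)

emb = AugmentedEmbedding(vocab_size=100, embed_dim=16, bert_dim=32)
ids = torch.randint(0, 100, (2, 5))
feats = torch.randn(2, 5, 32)  # stand-in for real BERT encoder outputs
out = emb(ids, feats)
```

Keeping the BERT features frozen and precomputed avoids backpropagating through the large pre-trained model, which helps when the downstream dataset is small.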
closing this for now!
We're working on a language generation task where we have relatively little data available and have been using the "Hierarchical Neural Story Generation" command line tools (thanks, they're really great!). We'd really like to work with a vocabulary larger than what our dataset contains and wonder whether utilising a pre-trained BERT model for producing word embeddings might help with this. Do you have any recommendations for going about this?