Closed Arjunsankarlal closed 5 years ago
@Arjunsankarlal we use an implementation from huggingface to provide BERT embeddings as a part of a model architecture. If you're just interested in running BERT on some sample input to get word vectors, I recommend you take a look at their library directly.
`BertModel` is a class in their library and, in their README, they provide documentation for how to get word vectors.
@schmmd Could you please point to an exact example? Sorry, I am confused here.
Thanks, Mahesh
Looks like huggingface documents this here: https://github.com/huggingface/pytorch-pretrained-BERT#bert
As BERT is included in the new release, I am trying to generate embeddings the way we generate them with ELMo for contextual representations.
While working with ELMo, it was easy to generate embeddings because there are specific APIs such as embed_sentence(), embed_sentences(), embed_batch(), etc.
In the case of BERT, I downloaded the pre-trained models from Google's BERT repo, loaded the model using
BertModel.from_pretrained('bert-base-uncased')
and tokenized the input sentence with BertTokenizer. Now the embedding-generation part is quite confusing, and I could not find any clear documentation about it.
Or, if I am not heading down the right path for this problem, please correct me. How can BERT and ELMo be used together to get more contextualised embeddings?
Any lead would be helpful.
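One possible lead for combining the two (a sketch, not an established recipe): run the sentence through both embedders and concatenate the per-token vectors along the feature axis. The dimensions below assume ELMo's 1024-dim top layer and bert-base's 768-dim hidden states; the random arrays are hypothetical stand-ins for the real model outputs:

```python
import numpy as np

def combine_embeddings(elmo_vectors, bert_vectors):
    """Concatenate per-token ELMo and BERT vectors along the feature axis.

    Assumes both inputs are aligned token-for-token, i.e. the same
    tokenization was used (in practice BERT's WordPiece sub-tokens must
    first be pooled back to whole tokens).
    """
    assert elmo_vectors.shape[0] == bert_vectors.shape[0]
    return np.concatenate([elmo_vectors, bert_vectors], axis=-1)

# Hypothetical placeholder outputs for a 6-token sentence:
elmo_out = np.random.randn(6, 1024)  # ELMo top layer: (tokens, 1024)
bert_out = np.random.randn(6, 768)   # bert-base last layer: (tokens, 768)

combined = combine_embeddings(elmo_out, bert_out)
print(combined.shape)  # (6, 1792)
```

The main practical wrinkle is the tokenization mismatch: ELMo works on whole words, while BERT splits words into WordPiece sub-tokens, so the BERT vectors must be aggregated (e.g. averaged) per word before concatenation.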