Closed orenpapers closed 4 years ago
They are embeddings generated by the model (BERT-base, I guess, since it has a hidden representation of 768 dimensions). You get 9 elements: one contextual embedding for each token in your sequence, which typically includes the special [CLS] and [SEP] tokens. The values of these embeddings represent hidden features that are not easy to interpret.
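A minimal sketch of inspecting that nested output, assuming the transformers library is installed; the bert-base-uncased checkpoint is a guess, since the thread does not name the exact model:

```python
# Sketch, assuming transformers is installed; bert-base-uncased is an
# assumed checkpoint (the thread does not say which model was used).
from transformers import pipeline

extractor = pipeline("feature-extraction", model="bert-base-uncased")
features = extractor("Hello, my dog is cute")

# The output is nested: [sentence][token][hidden_dimension]
print(len(features))        # number of input sentences (here 1)
print(len(features[0]))     # number of tokens, including [CLS] and [SEP]
print(len(features[0][0]))  # hidden size of BERT-base: 768
```

The token count depends on how the tokenizer splits your sentence, which is why the original poster saw 9 elements for their input.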
So the pipeline will just return the last-layer encoding of BERT? Then what is the difference with code like
# note: bert_model must be loaded with output_hidden_states=True for this to work
input_ids = torch.tensor(bert_tokenizer.encode("Hello, my dog is cute")).unsqueeze(0)
outputs = bert_model(input_ids)
hidden_states = outputs[-1][1:]  # outputs[-1] is the tuple of all hidden states; [1:] drops the embedding-layer output, leaving one tensor per encoder layer
layer_hidden_state = hidden_states[n_layer]
return layer_hidden_state
Also, does the BERT encoding have similar traits to word2vec? E.g. similar words will be closer, France - Paris = England - London, etc.?
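One way to check such analogy traits, once you have per-word vectors, is cosine similarity between the difference vectors. The vectors below are made-up toy stand-ins, not real BERT or word2vec embeddings (real BERT-base vectors would be 768-dimensional and contextual rather than per-word):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Made-up 4-dim stand-ins for word embeddings, chosen so the analogy holds.
paris   = np.array([0.9, 0.1, 0.8, 0.0])
france  = np.array([0.9, 0.1, 0.2, 0.7])
london  = np.array([0.1, 0.9, 0.8, 0.0])
england = np.array([0.1, 0.9, 0.2, 0.7])

# The word2vec-style analogy test: is (France - Paris) close to (England - London)?
print(cosine(france - paris, england - london))  # close to 1.0 for these toy vectors
```

This only sketches the similarity test itself; whether BERT's contextual embeddings actually exhibit word2vec-style analogies is an empirical question the thread leaves open.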
Hi @orko19, did you understand the difference between 'hidden_states' and the 'feature-extraction' pipeline? I'd like to understand it as well. Thanks!
@merleyc I do not! Please share if you do :)
The outputs of "last_hidden_state" and the "feature-extraction" pipeline are the same; you can verify this yourself.
The "feature-extraction" pipeline just handles the whole process for us, from tokenizing the words to producing the embeddings.
I am using the feature-extraction pipeline:
As output I get a list with one element, which is a list with 9 elements, each of which is a list of 768 features (floats). What does the output represent? What is each element of the lists, and what is the meaning of the 768 float values? Thanks