sasaadi closed 5 years ago
What is your exact command to extract the last hidden layer (layer -1)? And what is your exact command to get outputs[0] from pytorch_transformers.BertModel()?
To extract the last hidden layer (layer -1) from BERT, I run extract_features.py as follows:
python extract_features.py --input_file=tmp/input.txt --output_file=tmp/output.json --vocab_file=cased_L-12_H-768_A-12/vocab.txt --bert_config_file=cased_L-12_H-768_A-12/bert_config.json --init_checkpoint=cased_L-12_H-768_A-12/bert_model.ckpt --layers=-1 --max_seq_length=128 --batch_size=1
where the input_file contains only one line, e.g. 'here is an example .'
The output gives me the -1 hidden layer of each token separately.
To get the embeddings from outputs[0]:
config = BertConfig.from_pretrained('bert-base-cased')
tokenizer = BertTokenizer.from_pretrained('bert-base-cased')
model = BertModel(config)
input_ids = torch.tensor(tokenizer.encode("here is an example .")).unsqueeze(0) # Batch size 1
outputs = model(input_ids)
last_hidden_states = outputs[0]
where last_hidden_states gives me a list of embeddings. I presume there is one for each token in the sentence, in the same order they appear in the sentence.
Thanks
Please help, I am having the same problem. How can I extract features from a fine-tuned .bin file? BERT's original documentation only uses the init checkpoint (ckpt).
@sasaadi, you should load the pretrained model with model = BertModel.from_pretrained('bert-base-cased'). In your example only the config (a dict of hyper-parameters) is loaded from the pretrained model, not the weights.
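To illustrate why a config-only model cannot match the TensorFlow features, here is a minimal sketch. It assumes the newer `transformers` package (the successor of `pytorch_transformers`) and uses a tiny, made-up configuration and arbitrary token ids so that nothing has to be downloaded. Two models built from the same config get independent random weights, so their outputs[0] differ:

```python
import torch
from transformers import BertConfig, BertModel

# Hypothetical tiny configuration, chosen only so the sketch runs quickly offline.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)

# Arbitrary token ids (not real vocabulary entries), batch size 1.
input_ids = torch.tensor([[1, 5, 6, 7, 2]])

# BertModel(config) only reads hyper-parameters; the weights are randomly initialized.
torch.manual_seed(0)
model_a = BertModel(config).eval()
torch.manual_seed(1)
model_b = BertModel(config).eval()

with torch.no_grad():
    out_a = model_a(input_ids)[0]  # last hidden states, shape (batch, tokens, hidden)
    out_b = model_b(input_ids)[0]

# Same config, same input, different random weights: the outputs do not agree.
same_from_config = torch.allclose(out_a, out_b)
print(out_a.shape, same_from_config)
```

With real pretrained weights, BertModel.from_pretrained('bert-base-cased') replaces the random initialization, which is what makes the outputs comparable to those of extract_features.py.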
@thomwolf Is pytorch_transformers.BertModel.from_pretrained('bert-base-multilingual-cased', state_dict=model_state_dict) the right solution when loading from a fine-tuned model?
@hungph-dev-ict to load from a fine-tuned checkpoint you reference it directly: BertModel.from_pretrained('/path/to/finetuned/model').
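As a rough sketch of that round trip, again assuming the newer `transformers` package: the tiny config and the randomly initialized model below are hypothetical stand-ins for a real fine-tuned model, and a temporary directory plays the role of '/path/to/finetuned/model'. Saving with save_pretrained and reloading from the directory path should reproduce the same outputs[0]:

```python
import tempfile

import torch
from transformers import BertConfig, BertModel

# Hypothetical tiny configuration; a real fine-tuned model would have its own config.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)

torch.manual_seed(0)
finetuned = BertModel(config).eval()  # stand-in for a model after fine-tuning

# Arbitrary token ids (not real vocabulary entries), batch size 1.
input_ids = torch.tensor([[1, 5, 6, 7, 2]])

with tempfile.TemporaryDirectory() as model_dir:
    # Writes the config and the weights file into the directory.
    finetuned.save_pretrained(model_dir)
    # Pointing from_pretrained at the directory loads both config and weights.
    reloaded = BertModel.from_pretrained(model_dir).eval()
    with torch.no_grad():
        original_out = finetuned(input_ids)[0]
        reloaded_out = reloaded(input_ids)[0]

outputs_match = torch.allclose(original_out, reloaded_out, atol=1e-6)
print(outputs_match)
```

The directory is expected to contain the config file next to the weights, which is why passing the path alone is enough.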
@LysandreJik @thomwolf thank you very much. Now that this library has just added RoBERTa, I want to fine-tune it on my corpus. Do you have any suggestions?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Based on the BERT documentation (https://github.com/google-research/bert#using-bert-to-extract-fixed-feature-vectors-like-elmo) we can extract the contextualized token embeddings of each hidden layer separately. However, when I extract the last hidden layer (layer -1), it does not match the outputs[0] from pytorch_transformers.BertModel() as described here: https://huggingface.co/pytorch-transformers/model_doc/bert.html#bertmodel
Just to remind you, I am using the same pre-trained model (e.g. bert-base-uncased) and the same input (e.g. 'here is an example .') for both.