Can you ask a more specific question, with some more context around the issue you are seeing?
I just modified a single line in the tutorial code from https://allennlp.org/tutorials by adding the param pretrained_file='glove.840B.300d.txt'. I have already downloaded the file, but it doesn't work. I would appreciate it a lot if you could give an example of this; that would be helpful.
Can you provide a stacktrace which details the error that you're receiving?
EMBEDDING_DIM = 300
HIDDEN_DIM = 6
token_embedding = Embedding(num_embeddings=vocab.get_vocab_size('tokens'),
                            embedding_dim=EMBEDDING_DIM,
                            pretrained_file='glove.840B.300d.txt')
word_embeddings = BasicTextFieldEmbedder({"tokens": token_embedding})
lstm = PytorchSeq2SeqWrapper(torch.nn.LSTM(EMBEDDING_DIM, HIDDEN_DIM, batch_first=True))
model = LstmTagger(word_embeddings, lstm, vocab)
No error was raised after my modification, but I found no difference in the embeddings with pretrained_file. I print the embeddings in the forward function. I don't know if I have to change the indexer, so I just use the original one from the tutorial code.
def forward(self,
            sentence: Dict[str, torch.Tensor],
            labels: torch.Tensor = None) -> Dict[str, torch.Tensor]:
    mask = get_text_field_mask(sentence)
    embeddings = self.word_embeddings(sentence)
    print(embeddings)
    encoder_out = self.encoder(embeddings, mask)
    tag_logits = self.hidden2tag(encoder_out)
    output = {"tag_logits": tag_logits}
    if labels is not None:
        self.accuracy(tag_logits, labels, mask)
        output["loss"] = sequence_cross_entropy_with_logits(tag_logits, labels, mask)
    return output
The issue is that we don't support loading a pretrained file from the constructor. It appears the constructor parameter named pretrained_file is undocumented (cc @bryant1410; it looks like your script misses the fact that we put __init__ parameters in the class docstring) - it is only used to keep track of stuff for loading more embeddings at test time. If you want to actually load a pretrained embedding file, you currently need to do that by calling Embedding.from_params() (or Embedding._read_pretrained_embeddings_file() to get the weight, which you then pass to the constructor). We should probably make this easier, and document the constructor parameter.
EMBEDDING_DIM = 300
HIDDEN_DIM = 6
token_embedding = Embedding.from_params(
    vocab=vocab,
    params=Params({'pretrained_file': 'glove.840B.300d.txt',
                   'embedding_dim': EMBEDDING_DIM})
)
Thanks for your help. It works! By the way, Embedding._read_pretrained_embeddings_file() raises an AttributeError.
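For future readers, here is a minimal sketch of the weight-loading route mentioned above. It assumes _read_pretrained_embeddings_file is a module-level helper in allennlp.modules.token_embedders.embedding rather than a method on the Embedding class (which would explain the AttributeError); the vocab object and the GloVe path are taken from the earlier snippets.

import torch
from allennlp.modules.token_embedders import Embedding
# Assumption: the helper is module-level, not a classmethod of Embedding.
from allennlp.modules.token_embedders.embedding import _read_pretrained_embeddings_file

EMBEDDING_DIM = 300

# Read the GloVe vectors for the words in the vocabulary into a weight matrix.
weight = _read_pretrained_embeddings_file('glove.840B.300d.txt',
                                          EMBEDDING_DIM,
                                          vocab,
                                          namespace='tokens')

# Pass the weight to the constructor instead of relying on pretrained_file.
token_embedding = Embedding(num_embeddings=vocab.get_vocab_size('tokens'),
                            embedding_dim=EMBEDDING_DIM,
                            weight=weight,
                            trainable=True)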
NOTE: I (@matt-gardner) modified the code block in here to fix an error, in case future users stumble across this issue.
(cc @bryant1410; it looks like your script misses the fact that we put __init__ parameters in the class docstring)
Yeah, thanks for the heads-up. Somehow PyCharm's feature for showing the docs of a function or class works well with that, but the lint check fails on it.
Hi, I'm running into this error using the following from_params call. I'm using allennlp 1.0.1 and I installed allennlp_models using pip.
embeddings = Embedding.from_params(vocabulary, Params({"embedding_dim": embeddings_dimension,
                                                       "pretrained_file": file_path,
                                                       "vocab_namespace": "tokens",
                                                       "trainable": False}))
Error:
File "coref.py", line 40, in read_embeddings "trainable": False})) File "/data/home/test/cproject/allennlp/allennlp/common/from_params.py", line 533, in from_params "from_params was passed a
paramsobject that was not a
Params. This probably " allennlp.common.checks.ConfigurationError: from_params was passed a
paramsobject that was not a
Params. This probably indicates malformed parameters in a configuration file, where something that should have been a dictionary was actually a list, or something else. This happened when constructing an object of type <class 'allennlp.modules.token_embedders.embedding.Embedding'>.
Could you please take a look at this?
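A likely cause, assuming the allennlp 1.x FromParams signature (where the Params object is the first positional argument and extras such as the vocabulary are passed by keyword), is that vocabulary is being received as the params argument. A minimal sketch of the reordered call, under that assumption and reusing the names from the snippet above:

from allennlp.common import Params
from allennlp.modules.token_embedders import Embedding

# Sketch for allennlp 1.x: pass the Params object first,
# and the vocabulary as a keyword extra.
embeddings = Embedding.from_params(
    Params({"embedding_dim": embeddings_dimension,
            "pretrained_file": file_path,
            "vocab_namespace": "tokens",
            "trainable": False}),
    vocab=vocabulary,
)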
I am new to this... I tried the code above, but it seemed the same as the version without the 'pretrained_file' param.