Cannot construct BidirectionalLanguageModelTokenEmbedder #5556

Closed: david-waterworth closed this 2 years ago

david-waterworth commented 2 years ago

I've been adapting the model described in Louis Qin's "PyTorch ELMo, trained from scratch", but I'm having issues creating a BidirectionalLanguageModelTokenEmbedder from the pretrained model, i.e.

embedder = BidirectionalLanguageModelTokenEmbedder(
    archive_file=archive_file,
    bos_eos_tokens=("<s>", "</s>")
)

fails with the error Embeddings require either a size or a vocabulary. I traced this to the way BidirectionalLanguageModelTokenEmbedder constructs a new TextFieldEmbedder from the archived model's params without passing a vocabulary:

# Inside the token embedder's __init__:
archive = load_archive(archive_file)

config = archive.config
dict_config = config.as_dict(quiet=True)

text_field_embedder = dict_config["model"]["text_field_embedder"]
# No vocab is passed here, so Embedding.from_params cannot infer its size:
text_field_embedder = TextFieldEmbedder.from_params(Params(text_field_embedder))

I think the last line should be:

text_field_embedder = TextFieldEmbedder.from_params(vocab=self._lm.vocab, params=Params(text_field_embedder))
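
For illustration, here is a minimal standalone sketch of the fix (build_embedder_from_archive is a hypothetical helper I'm using to show the idea, not library API):

from allennlp.common import Params
from allennlp.models.archival import load_archive
from allennlp.modules.text_field_embedders import TextFieldEmbedder

def build_embedder_from_archive(archive_file: str) -> TextFieldEmbedder:
    archive = load_archive(archive_file)
    lm = archive.model  # the archived LanguageModel carries its Vocabulary
    params = Params(archive.config.as_dict(quiet=True)["model"]["text_field_embedder"])
    # Passing vocab lets Embedding.from_params size the embedding matrix from
    # the vocabulary instead of requiring an explicit num_embeddings.
    return TextFieldEmbedder.from_params(vocab=lm.vocab, params=params)
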
david-waterworth commented 2 years ago

Also, this code calls the LanguageModel's delete_softmax method, but LanguageModel.forward then attempts to compute the loss anyway and crashes.

An easy fix seems to be to guard the bulk of forward with a check that self._softmax_loss is not None:

https://github.com/allenai/allennlp-models/blob/1e89d5e51cb45f3e77a48d4983bf980088334fac/allennlp_models/lm/models/language_model.py#L262-L311
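
To illustrate where the guard would sit, here is a toy sketch (ToyLanguageModel is hypothetical and heavily simplified, not the actual LanguageModel from the linked file):

import torch

class ToyLanguageModel(torch.nn.Module):
    # Hypothetical stand-in for LanguageModel, showing only the proposed guard.
    def __init__(self) -> None:
        super().__init__()
        self._softmax_loss = torch.nn.CrossEntropyLoss()

    def delete_softmax(self) -> None:
        # Mirrors LanguageModel.delete_softmax: drop the loss head so the
        # model can be used purely as an embedder.
        self._softmax_loss = None

    def forward(self, embeddings, targets=None):
        return_dict = {"lm_embeddings": embeddings}
        # The proposed guard: skip the loss entirely once the softmax is deleted.
        if self._softmax_loss is not None and targets is not None:
            return_dict["loss"] = self._softmax_loss(embeddings, targets)
        return return_dict

lm = ToyLanguageModel()
lm.delete_softmax()
out = lm(torch.randn(4, 10))  # no crash: the loss computation is skipped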

But more generally, it appears this code path no longer works?

epwalsh commented 2 years ago

Hey @david-waterworth, I think both of your suggestions sound right. Care to make a PR?

github-actions[bot] commented 2 years ago

This issue is being closed due to lack of activity. If you think it still needs to be addressed, please comment on this thread 👇