allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0

Does allennlp contain any pre-trained character embeddings to use? Or an empty one? #3237

Closed RichardHWD closed 4 years ago

RichardHWD commented 5 years ago

Does allennlp contain any pre-trained character embeddings to use? Or an empty one?

brendan-ai2 commented 5 years ago

I'm a bit confused. Are you distinguishing between character and word embeddings? If you mean something like non-contextual word embeddings using solely character-level features, you could use just the Char CNN layer from ELMo. See https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md#using-elmo-as-a-pytorch-module-to-train-a-new-model. The (internal) code for that is here https://github.com/allenai/allennlp/blob/master/allennlp/modules/elmo.py#L261, I believe.

A similar class that tries to integrate into the AllenNLP APIs a bit more can be found here: https://github.com/allenai/allennlp/blob/021471a9579216b845f02b8beb8e33211df55019/allennlp/modules/seq2vec_encoders/cnn_highway_encoder.py#L13
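To make the idea concrete, here is a minimal, self-contained sketch of a character-CNN word encoder in plain PyTorch. It is not AllenNLP's actual `CnnHighwayEncoder` (which adds highway layers and a projection); the class name, vocabulary size, and filter sizes below are hypothetical, chosen only to illustrate the embed-convolve-pool pattern the linked code uses:

```python
import torch
import torch.nn as nn


class CharCnnEncoder(nn.Module):
    """Hypothetical sketch: embed each character, run 1-D convolutions of
    several widths over the character sequence, and max-pool over the
    character dimension to get a single fixed-size vector per word."""

    def __init__(self, num_chars=262, char_dim=16, filters=((2, 32), (3, 64))):
        super().__init__()
        self.char_embedding = nn.Embedding(num_chars, char_dim)
        # One Conv1d per (kernel_width, num_filters) pair.
        self.convs = nn.ModuleList(
            nn.Conv1d(char_dim, out_channels, kernel_size=width)
            for width, out_channels in filters
        )
        self.output_dim = sum(out for _, out in filters)

    def forward(self, char_ids):
        # char_ids: (batch, num_words, num_chars) integer tensor
        batch, num_words, num_chars = char_ids.shape
        x = self.char_embedding(char_ids.view(-1, num_chars))  # (B*W, C, D)
        x = x.transpose(1, 2)                                  # (B*W, D, C)
        # Max-pool each convolution's output over the character axis.
        pooled = [conv(x).max(dim=-1).values for conv in self.convs]
        out = torch.cat(pooled, dim=-1)                        # (B*W, output_dim)
        return out.view(batch, num_words, self.output_dim)


encoder = CharCnnEncoder()
ids = torch.randint(0, 262, (2, 5, 10))   # 2 sentences, 5 words, 10 chars each
word_vectors = encoder(ids)               # shape: (2, 5, 96)
```

Because the pooling collapses the character axis, the result is a non-contextual word vector built solely from character-level features, which is what the Char CNN layer inside ELMo provides.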

RichardHWD commented 5 years ago

@brendan-ai2 Yes, that's it! The reason I ask is that I think concatenating the ELMo character embedding with the ELMo output can help. How do I separate the character embedding out?

brendan-ai2 commented 5 years ago

I'm not sure I entirely understand your question. The character layer is already included in the ELMo embeddings. It sounds like this is what you want?
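If the goal is still to concatenate a separately computed character-level representation with the ELMo output, the combination itself is just a feature-axis `torch.cat`. A minimal sketch with random tensors standing in for the two representations (the 1024 and 128 dimensions here are assumptions, not fixed by the thread):

```python
import torch

batch, num_words = 2, 7
# Stand-ins: ELMo-style contextual output (1024-dim per token) and a
# hypothetical char-CNN representation (128-dim per token).
elmo_output = torch.randn(batch, num_words, 1024)
char_repr = torch.randn(batch, num_words, 128)

# Concatenate along the feature axis: one 1152-dim vector per token.
combined = torch.cat([elmo_output, char_repr], dim=-1)
```

Note, though, that since ELMo's own first layer is exactly its character CNN, this concatenation duplicates information the contextual output was already built from.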

DeNeutoy commented 4 years ago

Closing due to inactivity