allenai / bilm-tf

Tensorflow implementation of contextualized word representations from bi-directional language models
Apache License 2.0

How to train it on token level instead of character level? #154

Closed KunlinY closed 5 years ago

KunlinY commented 5 years ago

Hi, I would like to use your model on some sequential data, but those tokens do not have character information. However, I do have pre-trained embedding vectors for the tokens. Can your code be trained without character embeddings? Thank you!!

matt-peters commented 5 years ago

You can use the code to train with an embedding layer instead of character inputs (if not out of the box, then with some minor modifications).
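For context, a minimal sketch of what a token-level configuration might look like, assuming the options-dict format used by `train_elmo.py` in this repo, where (as I understand it) omitting the `'char_cnn'` entry makes the model fall back to a trainable token embedding lookup. All hyperparameter values below are illustrative placeholders; check `bilm/training.py` for the exact fallback behavior:

```python
# Hypothetical token-level options dict for bilm-tf (sketch, not verified
# against every version of the repo). The key difference from the default
# character-level setup is the absence of a 'char_cnn' entry.
options = {
    'bidirectional': True,
    # No 'char_cnn' key here: the model should use a token embedding
    # layer of shape [n_tokens_vocab, projection_dim] instead.
    'dropout': 0.1,
    'lstm': {
        'dim': 4096,
        'n_layers': 2,
        'cell_clip': 3,
        'proj_clip': 3,
        'projection_dim': 512,
        'use_skip_connections': True,
    },
    'n_tokens_vocab': 50000,          # placeholder: size of your vocabulary
    'batch_size': 128,                # placeholder values below
    'unroll_steps': 20,
    'n_negative_samples_batch': 8192,
}

# Sanity check: this config selects the token-level (non-character) path.
assert 'char_cnn' not in options
```

Initializing that embedding layer from your pre-trained vectors would be one of the "minor modifications" mentioned above, since the stock training script learns the embeddings from scratch.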