NVIDIA / sentiment-discovery

Unsupervised Language Modeling at scale for robust sentiment classification

Using pretrained word embeddings #49

Open SG87 opened 5 years ago

SG87 commented 5 years ago

Is it possible to start model training (main.py) from existing word embeddings such as fastText?

raulpuric commented 5 years ago

We have an update planned to address more advanced tokenization/data processing, but currently there's no easy way to do this. Loading the embedding weights into the model is easy; the harder part is changing the preprocessing to handle tokenization other than the current ASCII-256 character level, which word embeddings like fastText would need.
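
For the "load the embedding weights" half, here is a minimal sketch and not this repo's code. It assumes the vectors are in fastText's text `.vec` format (a `count dim` header line, then one `token v1 ... vdim` line per token) and that you already have a word-level vocabulary, which the current character-level preprocessing does not provide. The names `read_vec` and `build_embedding_matrix` are hypothetical helpers made up for illustration.

```python
import io
import random


def read_vec(handle):
    """Parse fastText's text .vec format from an open text handle."""
    header = handle.readline().split()
    dim = int(header[1])  # header is "<num_tokens> <dim>"
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


def build_embedding_matrix(vocab, vectors, dim, seed=0, scale=0.1):
    """Return one row per vocab token, in vocab order.

    Tokens without a pretrained vector get a small random row
    (an assumption; zeros or a shared <unk> row are other options).
    """
    rng = random.Random(seed)
    matrix, missing = [], []
    for token in vocab:
        if token in vectors:
            matrix.append(list(vectors[token]))
        else:
            missing.append(token)
            matrix.append([rng.uniform(-scale, scale) for _ in range(dim)])
    return matrix, missing


# Tiny in-memory stand-in for a real .vec file (made-up values).
sample = io.StringIO("3 2\ngood 0.1 0.2\nbad -0.1 -0.2\nfilm 0.3 0.0\n")
dim, vectors = read_vec(sample)
matrix, missing = build_embedding_matrix(["good", "film", "dull"], vectors, dim)
```

The resulting `matrix` could then be turned into a tensor and handed to something like PyTorch's `torch.nn.Embedding.from_pretrained`, assuming the model's embedding size is changed to match `dim`. The tokenization side of the problem is untouched by this sketch.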