vineetjohn / linguistic-style-transfer

Neural network parametrized objective to disentangle and transfer style and content in text
Apache License 2.0

[$WORD_EMBEDDINGS_PATH] when pre-training embedding model #70

Closed hyeseonko closed 5 years ago

hyeseonko commented 5 years ago

Hi,

I wanted to pre-train the word embedding model as described in the README:

```bash
./scripts/run_word_vector_training.sh \
    --text-file-path ${TRAINING_TEXT_FILE_PATH} \
    --model-file-path ${WORD_EMBEDDINGS_PATH}
```

But why do you differentiate between ${WORD_EMBEDDINGS_PATH} and ${VALIDATION_WORD_EMBEDDINGS_PATH}?

My questions are:

1. Can I use glove.6B.100d.txt for the word embedding pre-training step?
2. If not, which embedding files should I use in this case?

Thank you for releasing your code :)

vineetjohn commented 5 years ago

Here's what each of those bash variables means:

To answer your questions:

> Can I use glove.6B.100d.txt for the word embedding pre-training step?

No, not directly. The model expects word2vec embeddings. If you have a way to convert the glove embeddings into corresponding word2vec embeddings, then you can use the converted embeddings.
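For reference, the GloVe and word2vec text formats differ only in that word2vec prepends a header line containing the vocabulary size and vector dimensionality; each subsequent line ("word v1 v2 ... vN") is identical. So a conversion can be sketched with the standard library alone (gensim's `glove2word2vec` script does the same thing); the file paths here are assumptions.

```python
def glove_to_word2vec(glove_path, out_path):
    """Convert a GloVe text file to word2vec text format.

    The only difference between the two text formats is that word2vec
    starts with a header line "<vocab_size> <dimensions>".
    """
    with open(glove_path, encoding="utf-8") as f:
        lines = f.readlines()
    if not lines:
        raise ValueError("empty embedding file")
    # Dimensionality = tokens on a line, minus the leading word.
    dims = len(lines[0].split()) - 1
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(f"{len(lines)} {dims}\n")
        f.writelines(lines)

# Usage (paths are hypothetical):
# glove_to_word2vec("glove.6B.100d.txt", "glove.6B.100d.w2v.txt")
```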

> If not, which embedding files should I use in this case?

If you want to replicate the paper exactly, follow the steps listed here.