Hi @alex-tifrea,
I'm training word embeddings on the RCV1 corpus. I generated the vocabulary file (7 MB) and the co-occurrence file (7.5 GB) for RCV1 using the GloVe code, and I run the following command to train:
./run_glove.sh --train --root .. -coocc_file ../poincare_glove2/GloVe/cooccurrence.bin --vocab_file ../poincare_glove2/GloVe/vocab.txt --epochs 50 --workers 20 --restrict_vocab 200000 --lr 0.01 --poincare 1 --bias --size 100 --dist_func cosh-dist-sq
The script uses only a single CPU core, even though I pass --workers 20, and it does not write anything to the logs/* files.
Could anyone give me some pointers on whether I am doing something wrong, or whether something else is at fault?
Thanks