tejuafonja closed this issue 8 years ago
Sure, let me first provide some high-level info. Users who simply want some word vectors to use in a generic application can download pretrained word vectors as per the README. Users who eventually want to train on their own corpus are encouraged to first train on the data that we provide, as per the README, so that they can go through the motions.
In the process of training on our data, you'll generate vocab.txt and vectors.txt. So if you follow the README under https://github.com/stanfordnlp/GloVe#train-word-vectors-on-a-new-corpus, you should be set up!
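For anyone who wants to inspect the generated vectors file afterwards, here is a minimal sketch of reading it in Python. It assumes the text output format, where each line is a word followed by its space-separated vector components. The function name `parse_vectors` and the toy values below are made up for illustration; they are not real GloVe output.

```python
from typing import Dict, Iterable, List


def parse_vectors(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Map each word to its vector, given 'word v1 v2 ... vd' lines."""
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        # First field is the word, the rest are the vector components.
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Hypothetical 3-dimensional toy lines, only to show the layout.
toy = ["the 0.1 -0.2 0.3", "of 0.4 0.5 -0.6"]
vectors = parse_vectors(toy)
print(len(vectors["the"]))
```

To read a real file, pass an open file handle instead of the toy list, for example `parse_vectors(open("vectors.txt", encoding="utf-8"))`, assuming that is the file name your run produced.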
Unfortunately, that's out of the scope of what we can do to help you. Best of luck!
Thanks.
@Russell91 Hi, I am just trying to understand what vectors.txt and vocab.txt contain. The first few records of the vocab.txt file contain the following:
the 1061396
of 593677
and 416629
one 411764
in 372201
a 325873
What does the number corresponding to each word mean?
For instance, what does 1061396 corresponding to the word 'the' mean?
Thanks!
@akshay-vaidya: It's the number of times each (unique) word occurs in the training corpus. So 'the' appears 1061396 times.
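To make the format concrete, here is a small sketch that turns "word count" lines into a dictionary. The function name `parse_vocab` is hypothetical, and the sample records are the ones quoted earlier in this thread.

```python
from typing import Dict, Iterable


def parse_vocab(lines: Iterable[str]) -> Dict[str, int]:
    """Map each word to its corpus count, given 'word count' lines."""
    counts: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split from the right so the last field is always the count.
        word, count = line.rsplit(" ", 1)
        counts[word] = int(count)
    return counts


# Records quoted in the question above.
sample = [
    "the 1061396",
    "of 593677",
    "and 416629",
    "one 411764",
    "in 372201",
    "a 325873",
]
vocab = parse_vocab(sample)
print(vocab["the"])  # 1061396
```

The same function should work on the file itself if you pass it an open handle, e.g. `parse_vocab(open("vocab.txt", encoding="utf-8"))`.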
Thanks @drawar
@akshay-vaidya, @Russell91 Hi Akshay,
Could you please elaborate on the steps for generating the vocab.txt and vectors.txt files on Windows? I am unable to follow the steps described in the README.
Thanks
I am having trouble understanding where to get the vocab.txt and vectors.txt files. I am relatively new to this, please help. Thanks.