Open armintabari opened 8 years ago
Well, we need to store words as integers rather than strings in some data structures. The particular mapping is not crucial, but having some mapping is important. I believe we sort a list of words and read off the word_id as the position of each word in the list.
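To make the idea concrete, here is a minimal sketch of that kind of mapping: sort the distinct words and use each word's position in the sorted list as its ID. The function name `build_word_ids`, the alphabetical sort order, and the 1-based numbering are all assumptions for illustration; the actual code may sort by a different key (for example, frequency) or start counting from 0.

```python
def build_word_ids(words, start=1):
    """Map each distinct word to its position in a sorted list.

    Hypothetical helper: alphabetical order and the default
    1-based start index are assumptions, not confirmed behavior.
    """
    ordered = sorted(set(words))  # one entry per distinct word, in a fixed order
    return {word: index for index, word in enumerate(ordered, start=start)}


# Toy corpus; repeated words get a single ID.
word_to_id = build_word_ids(["the", "cat", "sat", "the"])

# Reverse lookup, useful for turning IDs back into words.
id_to_word = {index: word for word, index in word_to_id.items()}
```

As the reply says, the exact mapping matters less than using the same one consistently: any file that stores IDs can only be decoded with the list that produced them.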
Is the list you mention the vocab.txt? How can we find/use the list?
In the co-occurrence file, each line is in the format (word1_ID word2_ID count). What are these IDs? Are they the positions of the words in the vocab count file (vocab.txt)?