cbaziotis / ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, the latter comprising 330 million English tweets).
MIT License

Failed during generate_stats.py #12

Closed JingLiJJ closed 5 years ago

JingLiJJ commented 5 years ago

computing statistics for file: text8.txt
100%|███████████████████████████████████████████████████████████████████| 1/1 [12:41<00:00, 761.71s/it]

Writing 1-grams...
entries:250,982 - tokens:16,996,178
writing stats to file /home/jj301440/ekphrasis-master/ekphrasis/tools/../stats/text8/counts_1grams.txt
Writing 2-grams...
entries:4,136,483 - tokens:16,996,178
writing stats to file /home/jj301440/ekphrasis-master/ekphrasis/tools/../stats/text8/counts_2grams.txt
Writing 3-grams...
entries:10,327,876 - tokens:16,996,178
writing stats to file /home/jj301440/ekphrasis-master/ekphrasis/tools/../stats/text8/counts_3grams.txt
Traceback (most recent call last):
  File "generate_stats.py", line 191, in <module>
    write_stats(stats)
  File "generate_stats.py", line 147, in write_stats
    write_stats_to_file(filename, counter, args.mincount[int(k) - 1])
IndexError: list index out of range

Can you please tell me where the problem is?
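
For context, the failing expression in the traceback indexes `args.mincount` by the n-gram order minus one. The following is a minimal sketch, not the script's actual code: it assumes `mincount` is a plain list holding fewer thresholds than the number of n-gram orders being written (the argument parsing in `generate_stats.py` is not shown in this thread, so this is an assumption). `mincount_for` is a hypothetical helper introduced only for illustration, and the values 70 and 4 are made up. It reproduces the same kind of `IndexError` and shows one possible defensive lookup.

```python
def mincount_for(order, mincounts):
    """Return the minimum count for a 1-based n-gram order.

    Hypothetical helper (not part of ekphrasis): when fewer thresholds
    than n-gram orders are supplied, reuse the last threshold.
    """
    if not mincounts:
        raise ValueError("at least one mincount value is required")
    # Clamp the index so higher orders reuse the last configured threshold.
    index = min(order - 1, len(mincounts) - 1)
    return mincounts[index]


# Assumed scenario: three n-gram orders, but only two thresholds given.
mincounts = [70, 4]  # illustrative values only

try:
    mincounts[3 - 1]  # same indexing pattern as in the traceback
except IndexError as exc:
    print("direct lookup fails:", exc)

for order in (1, 2, 3):
    print(order, mincount_for(order, mincounts))
```

If this assumption holds, supplying one threshold per n-gram order would also avoid the error without changing the code.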