AI4Bharat / indicnlp_corpus

Description: Describes the IndicNLP corpus and associated datasets

No Frequency Files in Data Download #16

Open sumeet-iitg opened 1 year ago

sumeet-iitg commented 1 year ago

The text-corpora section of the README says:

Note

The vocabulary frequency files contain the frequency of all unique tokens in the corpus. Each line contains one word along with frequency delimited by tab.

However, the download links contain only the .txt files with paragraphs, not the frequency files.
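
Until the frequency files are published, a file in the format the README describes (one token and its count per line, separated by a tab) could be rebuilt from the downloaded .txt files. The following is a minimal sketch, not the maintainers' method: it assumes plain whitespace tokenization, which may not match whatever tokenizer produced the original counts, and the function names and file paths are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def count_tokens(lines: Iterable[str]) -> Counter:
    """Count whitespace-separated tokens across all lines.

    Assumption: whitespace splitting stands in for the corpus's real
    tokenizer, which is not specified in the issue.
    """
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def format_frequency_lines(counts: Counter) -> List[str]:
    """Return 'token<TAB>frequency' lines, most frequent first.

    Ties are broken by token so the output is deterministic.
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{token}\t{freq}" for token, freq in ordered]


def write_frequency_file(corpus_path: str, out_path: str) -> None:
    """Read a UTF-8 text corpus and write its token frequencies."""
    # Iterating over the file object streams it line by line,
    # so the whole corpus is never held in memory at once.
    with open(corpus_path, encoding="utf-8") as src:
        counts = count_tokens(src)
    with open(out_path, "w", encoding="utf-8") as dst:
        for line in format_frequency_lines(counts):
            dst.write(line + "\n")
```

For example, `write_frequency_file("hi.txt", "hi.freq.tsv")` would read a downloaded corpus file and write the counts next to it; both file names are placeholders.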