facebookresearch / cc_net

Tools to download and cleanup Common Crawl data
MIT License

How to compute only the perplexity of each paragraph using your language model on local data? #54

Open rongjingyue423 opened 1 year ago

rongjingyue423 commented 1 year ago

How can I compute only the perplexity of each paragraph using your language model on local data? I don't want to use -d to dump data. I have already downloaded the Chinese model with make lang=zh dl_lm.

monifeng commented 1 year ago

Hi, did you get this working? I also want to do something like this, so maybe we can discuss it and exchange ideas :)
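One possible approach, offered as a rough sketch rather than the repository's own method: load the downloaded model directly with the kenlm and sentencepiece Python packages and score each paragraph yourself. The file paths in the usage comment (data/lm_sp/zh.arpa.bin and data/lm_sp/zh.sp.model) are assumptions about where make lang=zh dl_lm puts the files, and the helper names paragraph_perplexity and load_scorer are made up for illustration. This sketch also skips any text normalization the cc_net pipeline may apply before tokenizing, so the numbers may not match the pipeline's output exactly.

```python
from typing import Callable, List


def paragraph_perplexity(pieces: List[str], score: Callable[[str], float]) -> float:
    """Perplexity of one tokenized paragraph.

    `score` takes a space-joined token string and returns its total log10
    probability, end-of-sentence token included (kenlm's Model.score
    behaves this way with its default arguments).
    """
    if not pieces:
        # Nothing to score; returning infinity is a choice made for this sketch.
        return float("inf")
    log10_prob = score(" ".join(pieces))
    # +1 accounts for the end-of-sentence token that is also scored.
    return 10.0 ** (-log10_prob / (len(pieces) + 1))


def load_scorer(lm_path: str, sp_path: str) -> Callable[[str], float]:
    """Build a `paragraph -> perplexity` function from local model files."""
    # Imported lazily so the pure function above works without these packages.
    import kenlm
    import sentencepiece

    model = kenlm.Model(lm_path)
    sp = sentencepiece.SentencePieceProcessor()
    sp.load(sp_path)

    def perplexity(paragraph: str) -> float:
        return paragraph_perplexity(sp.encode_as_pieces(paragraph), model.score)

    return perplexity


# Usage (paths are assumptions, adjust to your checkout):
#   ppl = load_scorer("data/lm_sp/zh.arpa.bin", "data/lm_sp/zh.sp.model")
#   for paragraph in open("my_local_file.txt", encoding="utf-8"):
#       paragraph = paragraph.strip()
#       if paragraph:
#           print(ppl(paragraph), paragraph, sep="\t")
```

Keeping the perplexity arithmetic separate from model loading makes it easy to check the formula with a stub scorer before pointing it at the real model.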