MIND-Lab / OCTIS

OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at EACL2021 demo track)
MIT License

Can I get the original dataset? #119

Open Steveluo005 opened 8 months ago

Steveluo005 commented 8 months ago

Description

Hi, I'm trying to browse some of the documents while running a comparison. However, I noticed that the datasets you provide are already preprocessed, which makes the documents hard to read. I looked up one of the original datasets, 20NewsGroup, and it appears that the version provided here has also been filtered somewhat. Could you please provide the original, unprocessed datasets? Thank you very much; I look forward to your reply!