rashadulrakib / short-text-clustering-enhancement

Access to datasets not preprocessed #6

Open urospet opened 3 years ago

urospet commented 3 years ago

Is it possible to get access to the datasets before preprocessing, especially Tweets and Google News? I was not able to find them.

rashadulrakib commented 3 years ago

Hello,

I think the datasets at the following links in my GitHub repository are not preprocessed:

https://raw.githubusercontent.com/rashadulrakib/short-text-clustering-enhancement/master/data/agnewsdataraw-8000

https://raw.githubusercontent.com/rashadulrakib/short-text-clustering-enhancement/master/data/biomedical/biomedical_true_text

https://raw.githubusercontent.com/rashadulrakib/short-text-clustering-enhancement/master/data/stackoverflow/stackoverflow_true_text
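
For anyone who wants to inspect these files programmatically, here is a minimal sketch using only the Python standard library. The short names (`agnews`, `biomedical`, `stackoverflow`) and the helpers `raw_url` and `fetch_lines` are hypothetical and exist only for this illustration. The UTF-8 decoding and the one-record-per-line layout are assumptions, not something confirmed in this thread.

```python
from urllib.request import urlopen

# Base path taken from the three links above.
BASE = (
    "https://raw.githubusercontent.com/rashadulrakib/"
    "short-text-clustering-enhancement/master/data"
)

# Hypothetical short names mapped to the paths listed in this thread.
RAW_FILES = {
    "agnews": "agnewsdataraw-8000",
    "biomedical": "biomedical/biomedical_true_text",
    "stackoverflow": "stackoverflow/stackoverflow_true_text",
}


def raw_url(name):
    """Build the raw.githubusercontent.com URL for one of the files above."""
    return f"{BASE}/{RAW_FILES[name]}"


def fetch_lines(name, limit=5):
    """Download a file and return its first `limit` lines.

    Assumes UTF-8 text with one record per line; undecodable bytes are
    replaced rather than raising, since the real encoding is not confirmed.
    """
    with urlopen(raw_url(name)) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    return text.splitlines()[:limit]
```

Calling `fetch_lines("agnews", limit=2)` should return the first two lines of the AG News file, which makes it easy to check by eye whether the text still has its original casing and punctuation.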

urospet commented 3 years ago

You are right about these three datasets, thanks! How about Google News? It seems to be already preprocessed.