Query related to grouping the data

utpal150894 commented 3 years ago

I have downloaded the mixed dataset from the link mentioned. How did you separate those PDF files into different groups? Did you do it manually?

ionathan commented 3 years ago

The mixed data can be separated using the learned model. See the paper for more context.

utpal150894 commented 3 years ago

Sir, my question is this: I downloaded all the policy documents from the link mentioned in the paper, which contains all the policy PDFs. But how do we divide those PDFs into groups such as adaptation, mitigation, and non-climate? You do not provide the PDFs segregated into folders, so do we have to separate them manually? The samples in your Git repo for adaptation, mitigation, and non-climate PDFs are not the entire set, right? The downloaded folder contains 13,301 PDFs, so separating them into their respective groups is a hectic task. Is there a simple approach that you followed to separate them?

ionathan commented 3 years ago

Indeed, you are right, it's a challenging task. This is why we developed this tool. The big dataset is only for testing!
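
To make "separate the mixed data with the learned model" concrete, here is a minimal sketch of the grouping step. It is not the repository's code. It assumes the text has already been extracted from each PDF, and `keyword_stub` is a hypothetical placeholder standing in for whatever trained classifier you load. The label names follow the three groups mentioned in this thread; `group_documents` and the sample file names are made up for illustration.

```python
from typing import Callable, Dict, List

# Group names taken from the discussion above.
LABELS = ("adaptation", "mitigation", "non-climate")


def keyword_stub(text: str) -> str:
    """Hypothetical placeholder for a trained classifier.

    NOT the model from the repository: it only checks for a keyword so
    that the sketch runs end to end without any dependencies.
    """
    lowered = text.lower()
    if "adaptation" in lowered:
        return "adaptation"
    if "mitigation" in lowered:
        return "mitigation"
    return "non-climate"


def group_documents(
    texts: Dict[str, str], predict: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Assign each document name to the group predicted for its text."""
    groups: Dict[str, List[str]] = {label: [] for label in LABELS}
    for name, text in sorted(texts.items()):
        label = predict(text)
        # setdefault keeps the sketch from crashing on an unexpected label.
        groups.setdefault(label, []).append(name)
    return groups


# Toy input: file name -> already-extracted text (assumed, not real data).
sample = {
    "a.pdf": "National adaptation plan for coastal flooding",
    "b.pdf": "Emission mitigation targets for 2030",
    "c.pdf": "Road maintenance budget",
}
groups = group_documents(sample, keyword_stub)
```

To use a real model, pass its prediction function as `predict` in place of `keyword_stub`. The resulting dictionary can then drive whatever file-moving step you prefer.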

BigDataWUR / ML4ClimateAdaptationPolicy
