DataKind-BLR / PrathamBooks-Sprint-2018

Code and documentation for the collaboration with PrathamBooks during Sprint 2018
MIT License

Processing Approach #22

Open abmath opened 6 years ago

abmath commented 6 years ago

(1) Take the list of existing tags and identify the type of each word. Initial hypothesis: the majority of tags will be adjectives, common nouns and verbs.
(2) Look at the distribution of tags across categories.
(3) Try the following text processing approaches: tf-idf, cosine similarity, and cosine similarity with POS tagging. Build a supervised classifier that assigns tags to categories, and compare the approaches to see which one categorises best.
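To make the tf-idf plus cosine similarity idea concrete, here is a minimal sketch in plain Python of a nearest-example classifier. It is only an illustration under assumptions: the tag phrases, the category names ("nature", "school") and the helper names (`build_idf`, `tfidf_vector`, `classify`) are all hypothetical and not taken from the PrathamBooks data, and the POS-tagging variant is not covered.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_idf(documents):
    """Compute a smoothed inverse document frequency for every term."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1
        for term, df in doc_freq.items()
    }


def tfidf_vector(text, idf):
    """Sparse tf-idf vector as a dict; terms unseen in training are dropped."""
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, labelled_examples, idf):
    """Return the category of the most similar labelled example, or None."""
    query = tfidf_vector(text, idf)
    best_score, best_label = 0.0, None
    for example_text, label in labelled_examples:
        score = cosine_similarity(query, tfidf_vector(example_text, idf))
        if score > best_score:
            best_score, best_label = score, label
    return best_label


# Hypothetical labelled tags; real tags and categories would come from the data.
LABELLED_TAGS = [
    ("animal stories", "nature"),
    ("forest animals", "nature"),
    ("school friends", "school"),
    ("first day at school", "school"),
]
IDF = build_idf([text for text, _ in LABELLED_TAGS])

print(classify("wild animals", LABELLED_TAGS, IDF))
print(classify("school trip", LABELLED_TAGS, IDF))
```

A tag with no words in common with any labelled example gets `None` rather than a guessed category, which makes it easy to count how many tags an approach fails to cover when comparing the approaches.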