Open shahrukhx01 opened 3 years ago
@lalitpagaria following are the steps involved in doing this:
Hope this would help.
@shahrukhx01 Thanks for the information. Let me read through them. For the first version, would it be possible to build clusters from a list of texts? For example, if Obsei fetches 200 reviews, can we generate clusters from these 200 texts and then tag each review with the cluster it belongs to? Also, is it possible to get multiple categories?
@lalitpagaria that's where topic modelling comes into play: it assigns categories based on the content of the documents. We have a separate issue for that: #131
Yeah, my bad. Then let's integrate topic modelling first.
@lalitpagaria could you create a dataset of 200 posts as a CSV and host it on Kaggle? I’ll take it up in the first week of August if no one takes up these two issues.
@lalitpagaria for getting document vectors we can use this
https://github.com/UKPLab/sentence-transformers
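To make the idea concrete, here is a rough sketch of how document vectors from sentence-transformers could be used to tag each review with a cluster. It is only an illustration, not Obsei code: the model name `all-MiniLM-L6-v2`, the helper names (`embed_with_sentence_transformers`, `tag_clusters`, `tag_reviews`), the 0.6 similarity threshold, and the greedy threshold clustering itself are all assumptions made for this example. A real integration would likely use a proper clustering or topic modelling method instead.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def embed_with_sentence_transformers(
    texts: Sequence[str], model_name: str = "all-MiniLM-L6-v2"
) -> List[List[float]]:
    """Encode texts into dense vectors (model name is an assumed choice)."""
    # Imported lazily so the clustering helpers work without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # encode() returns a numpy array by default; convert to plain lists.
    return model.encode(list(texts)).tolist()


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def tag_clusters(vectors: Sequence[Vector], threshold: float = 0.6) -> List[int]:
    """Greedy threshold clustering (illustrative stand-in, not a tuned method).

    Each vector joins the most similar existing cluster seed if the
    similarity reaches `threshold`; otherwise it becomes a new seed.
    """
    seeds: List[Vector] = []
    labels: List[int] = []
    for vec in vectors:
        best_label, best_sim = -1, threshold
        for label, seed in enumerate(seeds):
            sim = cosine(vec, seed)
            if sim >= best_sim:
                best_label, best_sim = label, sim
        if best_label == -1:
            seeds.append(vec)
            best_label = len(seeds) - 1
        labels.append(best_label)
    return labels


def tag_reviews(
    texts: Sequence[str],
    embed: Callable[[Sequence[str]], List[List[float]]] = embed_with_sentence_transformers,
    threshold: float = 0.6,
) -> List[int]:
    """Return one cluster label per input text."""
    return tag_clusters(embed(texts), threshold)
```

Calling `tag_reviews(reviews)` on the fetched texts would return one integer label per review. Note that the number of clusters depends on the threshold, and each review gets exactly one label, so this does not cover the multiple-categories case.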