bnosac / udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
https://bnosac.github.io/udpipe/en
Mozilla Public License 2.0

add dbscan together with word2vec/doc2vec/paragraph2vec #83

Open jwijffels opened 3 years ago

jwijffels commented 3 years ago

doc2vec should be on CRAN in 2 days, so paragraph2vec (PV-DM/PV-DBOW) will be easy now. That means we can easily do something like https://github.com/ddangelov/Top2Vec and combine it with udpipe / sentencepiece / dbscan / uwot / word2vec / tfidf to obtain semantic topic detection. Even https://github.com/MaartenGr/BERTopic#ctfidf is within reach using golgotha.
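To make the clustering step concrete, here is a minimal sketch of density-based clustering (DBSCAN) over document vectors, written in plain Python with no dependencies. It is only an illustration of the idea, not the API of the dbscan R package or of Top2Vec: the function name `dbscan`, the toy 2-D `doc_vectors`, and the `eps` / `min_pts` values are all assumptions, and in a real pipeline the vectors would come from doc2vec embeddings, probably reduced with uwot first.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but does not expand
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical toy "document embeddings": two dense groups and one outlier.
doc_vectors = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 0.0),
]
labels = dbscan(doc_vectors, eps=0.5, min_pts=3)
```

Each dense group would then be treated as one topic, and documents labelled -1 stay unassigned, which is the behaviour that makes density-based clustering attractive for topic detection.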

jwijffels commented 3 years ago

or maybe alongside https://github.com/gagolews/genieclust