koheiw / seededlda

LDA for semisupervised topic modeling
https://koheiw.github.io/seededlda/
73 stars, 16 forks

Take into account frequency of seed words in data in seeded LDA #8

Closed koheiw closed 3 years ago

koheiw commented 3 years ago

A pattern such as "school*" matches not only "school" and "schools" but also infrequent words such as "schoolgirl", yet we give the same pseudo counts to all the matched words.
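
To make the problem concrete, here is a minimal Python sketch of one possible remedy: scaling each matched word's pseudo count by its share of the total frequency of the words matched by the pattern. This is not the seededlda implementation (the package is written in R and C++). The function `seed_pseudo_counts`, the `total_weight` parameter, and the toy frequencies are all hypothetical, chosen only for illustration.

```python
from fnmatch import fnmatchcase


def seed_pseudo_counts(term_freq, pattern, total_weight=1.0, uniform=False):
    """Assign pseudo counts to vocabulary words matched by a glob pattern.

    Hypothetical helper for illustration only, not part of seededlda.
    term_freq: dict mapping each word to its frequency in the corpus.
    uniform=True reproduces the behaviour described in the issue:
    every matched word receives the same pseudo count.
    """
    matched = {t: f for t, f in term_freq.items() if fnmatchcase(t, pattern)}
    if not matched:
        return {}
    if uniform:
        return {t: total_weight for t in matched}
    total = sum(matched.values())
    if total == 0:
        return {t: 0.0 for t in matched}
    # Split total_weight among matched words in proportion to frequency.
    return {t: total_weight * f / total for t, f in matched.items()}


# Toy corpus frequencies (made up for this example).
term_freq = {"school": 120, "schools": 60, "schoolgirl": 20, "teacher": 50}

uniform = seed_pseudo_counts(term_freq, "school*", uniform=True)
scaled = seed_pseudo_counts(term_freq, "school*")
```

With uniform weighting, "schoolgirl" gets the same pseudo count as "school" even though it is six times rarer in the toy data. With frequency scaling, its pseudo count shrinks in proportion. Whether proportional scaling is the right correction, as opposed to, say, a smoothed or log-scaled variant, is a design choice this sketch does not settle.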