jhlau / doc2vec

Python scripts for training/testing paragraph vectors
Apache License 2.0

Pre-processing of text #3

Closed by AgrawalAmey 7 years ago

AgrawalAmey commented 7 years ago

Is it preferable to perform stemming or stop-word removal on the data before feeding it to the pre-trained DBOW model?

jhlau commented 7 years ago

It's up to you. We didn't use either in our experiments and found that DBOW still does very well.
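
For illustration, here is a minimal sketch of a pre-processing step in the spirit of the reply above: no stemming, and stop words kept by default. The regex tokenizer, the lowercasing, the tiny `STOP_WORDS` list, and the `preprocess` function name are all assumptions made for this example, not the authors' actual pipeline.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = frozenset({"a", "an", "the", "is", "of", "to", "and", "in"})

# Assumed tokenizer: runs of letters/digits, with an optional
# apostrophe suffix so that "don't" stays a single token.
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")


def preprocess(text, remove_stop_words=False):
    """Lowercase and tokenize `text`; no stemming is applied.

    Stop words are kept unless `remove_stop_words` is True.
    """
    tokens = TOKEN_RE.findall(text.lower())
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


if __name__ == "__main__":
    sentence = "The cat is running to the park."
    print(preprocess(sentence))
    print(preprocess(sentence, remove_stop_words=True))
```

The returned list of string tokens is the kind of input that gensim's `Doc2Vec.infer_vector` takes. Toggling `remove_stop_words` makes it easy to compare both variants on your own data.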