bstewart / stm

An R Package for the Structural Topic Model

Custom Stopwords #221

Open kmtimm opened 4 years ago

kmtimm commented 4 years ago

I am trying to remove some frequently occurring words from my corpus using stm's built-in textProcessor. My code ran without errors, but the words I specified were not removed. Does anyone know whether I need to format the list of words differently? Neither the stm description nor the vignette has any example code showing how to specify the words on the custom stop words list. Thanks!

processed <- textProcessor(data$Text, metadata = data, lowercase = TRUE, removestopwords = TRUE, removenumbers = TRUE, removepunctuation = TRUE, stem = FALSE, wordLengths = c(3, Inf), sparselevel = 1, language = "en", verbose = TRUE, onlycharacter = FALSE, striphtml = TRUE, customstopwords = "climate, change, national, assessment, report, said", v1 = FALSE)

ZackCzq commented 4 years ago

I think you need to pass the stopwords as a character vector, with each word as its own quoted element: customstopwords = c("climate", "change", "national", ...)

Might be too late for you to see this, but I hope it helps :-)
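
For anyone landing here later, here is a minimal sketch of the one-word-per-element form. It assumes the likely cause is that a single string such as "climate, change, national" is treated as one stop word entry (the whole phrase), so none of the individual words match. The toy documents and the variable names (stop_string, custom_stops, docs) are made up for illustration; only textProcessor and its customstopwords, stem, and verbose arguments come from stm.

```r
# The original call passed all six words as ONE string:
stop_string <- "climate, change, national, assessment, report, said"

# Split on commas and trim whitespace so each word is its own element.
custom_stops <- trimws(strsplit(stop_string, ",", fixed = TRUE)[[1]])
print(custom_stops)

# Hypothetical toy documents, for illustration only.
docs <- c(
  "The national climate assessment report said sea levels are rising quickly",
  "Officials said the climate change report covers drought and wildfire risk"
)

# Run the stm step only if the packages are installed
# (textProcessor relies on the tm package internally).
if (requireNamespace("stm", quietly = TRUE) &&
    requireNamespace("tm", quietly = TRUE)) {
  processed <- stm::textProcessor(
    docs,
    stem = FALSE,
    verbose = FALSE,
    customstopwords = custom_stops  # character vector, one word per element
  )
  # Any custom stop words still present in the vocabulary would show up here.
  print(intersect(custom_stops, processed$vocab))
}
```

If the last line prints an empty character vector, the custom stop words were dropped from the vocabulary. Writing the vector directly as c("climate", "change", ...) is equivalent; the strsplit step is only a convenience if the words already live in one comma-separated string.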