JasonKessler / scattertext

Beautiful visualizations of how language differs among document types.
Apache License 2.0

Is it possible to remove stopwords without spaCy? #105

Closed by gskarp 2 years ago

gskarp commented 2 years ago

I have been having problems with spaCy lately, so I tried to use scattertext without spaCy, as described in the relevant .py file. Because of the large number of words, it takes a long time and a lot of memory to load. Is it possible to remove stopwords without spaCy?

JasonKessler commented 2 years ago

The stoplisting is independent of spaCy. If a visualization is taking a long time to load in the browser, I'd run .compact(st.AssociationCompactor(2000)), which reduces the number of terms in the visualization to 2000.
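
For readers who want to try this, here is a minimal sketch of a spaCy-free workflow. It is an illustration under stated assumptions, not the maintainer's own code. `remove_stopwords`, `build_compact_corpus`, and the tiny `STOPWORDS` set are hypothetical names made up for this example. The scattertext calls (`st.whitespace_nlp_with_sentences`, `get_stoplisted_unigram_corpus()`, `compact(st.AssociationCompactor(...))`) are assumed to be available in your installed version, so check them against its documentation.

```python
import re

# Small illustrative stop list (an assumption; swap in any list you prefer).
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is", "it"})


def remove_stopwords(text, stopwords=STOPWORDS):
    """Hypothetical helper: lowercase the text, tokenize with a regex,
    and drop stop words. No spaCy involved."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return " ".join(tok for tok in tokens if tok not in stopwords)


def build_compact_corpus(df, category_col, text_col, max_terms=2000):
    """Hypothetical helper: build a scattertext corpus with the built-in
    whitespace tokenizer, stoplist it, and cap the number of terms."""
    # Imported lazily so remove_stopwords() works without scattertext.
    import scattertext as st

    corpus = st.CorpusFromPandas(
        df,
        category_col=category_col,
        text_col=text_col,
        nlp=st.whitespace_nlp_with_sentences,  # regex tokenizer, not spaCy
    ).build()
    # Keep stoplisted unigrams only, then keep at most max_terms terms.
    return corpus.get_stoplisted_unigram_corpus().compact(
        st.AssociationCompactor(max_terms)
    )
```

Either route may be enough on its own: pre-filtering the text with something like `remove_stopwords` shrinks the vocabulary before the corpus is built, while the second helper stoplists and compacts after building.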