LeoPAllen closed 3 years ago
I'd recommend making sure you're using the latest version of Scattertext.
If that doesn't solve your issue, please include both runnable code and a data set that replicates this term miscount, along with an example of what you expected to see for a given term and what is happening instead.
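To produce the "expected" number for a given term, it can help to count mentions independently of Scattertext. The following is a minimal sketch using only the Python standard library; the function name `count_documents_mentioning` and the sample documents are made up for illustration. It assumes a "mention" means a case-insensitive whole-word match, which may not match how Scattertext tokenizes text, so small differences are not necessarily a bug.

```python
import re


def count_documents_mentioning(term, documents):
    """Count documents containing `term` as a whole word or phrase.

    Hypothetical helper: matching is case-insensitive and uses regex word
    boundaries, which is an assumption and may differ from the tokenizer
    used to build the corpus.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", flags=re.IGNORECASE)
    return sum(1 for doc in documents if pattern.search(doc))


# Made-up sample data standing in for the real responses.
docs = [
    "The service was great",
    "Great value, great staff",
    "Nothing to report",
]

# Counts documents, not occurrences: the second document counts once.
print(count_documents_mentioning("great", docs))
```

Comparing this figure with the count shown in the plot for the same term gives a concrete expected-versus-actual pair to report.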
Closing due to inactivity
I'm building a scattertext document based on a fairly small dataset (~100 responses). Everything seems to be working, except the mention count ("Not found in any" or "Some of the N mentions:...") is clearly incorrect. Any idea how I can debug the issue? I've investigated the corpus and nothing about the metadata (corpus.get_metadata_freq_df('')) seems off. When I try to call corpus.get_term_count_df(), the method raises a ValueError: arrays must all be same length.
The number of mentions explicitly indicated by the scattertext document does not agree with the number of mentions that actually appear when I search for a specific term (
The data is sensitive, so I'd prefer not to expose the text in my screenshot or share the code explicitly.
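One possible way to share a replication without exposing the text is to replace every distinct word with an opaque placeholder, so term frequencies stay the same but the content is unreadable. This is only a sketch under stated assumptions: `pseudonymize` is a hypothetical helper, tokens are taken to be lowercased runs of word characters, and punctuation is dropped, so a miscount that depends on casing or punctuation might not reproduce on the scrambled data.

```python
import re


def pseudonymize(documents):
    """Replace each distinct token with a stable placeholder such as tok0000.

    Hypothetical helper. Assumes tokens are lowercased runs of word
    characters; punctuation and original casing are discarded.
    Returns the scrambled documents and the token-to-placeholder mapping.
    """
    mapping = {}
    scrambled = []
    for doc in documents:
        tokens = re.findall(r"\w+", doc.lower())
        placeholders = []
        for tok in tokens:
            if tok not in mapping:
                # Number placeholders in order of first appearance.
                mapping[tok] = f"tok{len(mapping):04d}"
            placeholders.append(mapping[tok])
        scrambled.append(" ".join(placeholders))
    return scrambled, mapping


# Made-up sample data standing in for the sensitive responses.
sample = ["Great service", "great value"]
scrambled_docs, token_map = pseudonymize(sample)
print(scrambled_docs)
```

The mapping reveals the original words, so it should stay private; only the scrambled documents and their category labels would be shared.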
Environment