Closed: yshi2016 closed this issue 2 years ago
You can try setting the `max_docs_per_category` parameter of `produce_scattertext_explorer` to limit the number of documents stored per category; plotting positions will still be calculated over the whole corpus. Otherwise, your best bet is to downsample the data you use to create your corpus.
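The downsampling suggested above can be sketched with pandas. This is a minimal illustration, not Scattertext's own API: the column names `category` and `text`, the toy data, and the cap of 100 documents per category are all assumptions.

```python
import pandas as pd

# Toy stand-in for a large corpus dataframe; real data would have
# ~1M rows with a category label and a text column.
df = pd.DataFrame({
    "category": ["pos"] * 500 + ["neg"] * 500,
    "text": [f"document {i}" for i in range(1000)],
})

MAX_DOCS_PER_CATEGORY = 100  # illustrative cap

# Keep at most MAX_DOCS_PER_CATEGORY rows from each category;
# random_state makes the sample reproducible.
sampled = (
    df.groupby("category", group_keys=False)
      .sample(n=MAX_DOCS_PER_CATEGORY, random_state=0)
)

print(len(sampled))                                   # 200
print(sampled["category"].value_counts().to_dict())   # 100 per category
```

The smaller frame can then be used to build the Scattertext corpus as usual, which shrinks the generated HTML roughly in proportion to the number of documents retained.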
Hello, I used a dataset with more than 1 million rows, and the output file is 572 MB, with output like below.
I am wondering whether this is due to the file being too large. Is there a built-in method in Scattertext to accommodate the size issue, or should we sample a subset of the original data? Thank you!