Closed by ajdapretnar 2 years ago
This doesn't seem to be Data Sampler's fault. Replacing it with Corpus Viewer and selecting a few documents will cause the same error. The issue is with how Corpus indexes BoW features. This will be fixed in orange3-text.
Yes, selecting documents through the Select Columns widget also causes the error. However, sampling the corpus before BoW seems to avoid the issue.
Data Sampler is not the problem here. It is actually a topic modelling/ngram_corpus issue. Closing this issue since it must be solved in text via https://github.com/biolab/orange3-text/issues/809
What's wrong?
I suspect Data Sampler doesn't work well with bag-of-words/sparse data. Not sure. Here's the original issue: https://github.com/biolab/orange3-text/issues/809
How can we reproduce the problem?
See https://github.com/biolab/orange3-text/issues/809. Could be just text-related, but I doubt it.
What's your environment?