Open MaazBinMusa opened 1 month ago
Ran into the same thing. Based on the documentation (looking at v1.5.1), this is what it says for the `stop_words` arg:

> If None, no stop words will be used. In this case, setting max_df to a higher value, such as in the range (0.7, 1.0), can automatically detect and filter stop words based on intra corpus document frequency of terms.

It might be the case that some corpora are too small to automatically infer stop words. For now I'm just skipping that step in `documents_without_stop_words`.
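To make the "corpus too small" guess concrete, here is a minimal pure-Python sketch of document-frequency filtering. It is not the actual implementation behind `max_df`, and `filter_by_max_df` is a hypothetical helper written only for illustration. I also haven't confirmed this is what causes the crash. It just shows that with a single document every term has a document frequency of 1.0, so any `max_df` below 1.0 filters out the whole vocabulary.

```python
from collections import Counter


def filter_by_max_df(docs, max_df=0.7):
    """Hypothetical helper: drop terms that appear in more than
    a `max_df` proportion of the documents (simplified illustration)."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]

    # Count, for each term, how many documents contain it at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Keep only the terms whose document frequency is within the threshold.
    keep = {term for term, count in doc_freq.items() if count / n_docs <= max_df}
    return [[t for t in tokens if t in keep] for tokens in tokenized]


# Three documents: "the" appears in all of them, so it is treated as a stop word.
print(filter_by_max_df(["the cat sat", "the dog ran", "the bird flew"]))

# One document: every term has document frequency 1.0, so nothing survives.
print(filter_by_max_df(["the cat sat"]))
```

If something like this is going on, an empty vocabulary for a one-document input would be a plausible reason for the failure, which is why skipping the stop word step works around it.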
Describe the bug
Testing on a single document results in a crash.

To Reproduce
Steps to reproduce the behavior:

Reproduction example
I copied the code from the readme.md example. The only difference was that my train and test sets were not distinct: I pulled one document from the train set and passed it as input ([test_doc]) to the transform function.