@alwil, this is the ngrams branch, in which I created a couple of functions and moved them to a separate file, and did the same with the tests. If we merge this branch, then I have everything in place in main to continue with the topic modeling. The cleaning now seems to work with bigrams and trigrams as well.