Open alunap opened 1 month ago
Thanks for posting.
The tokenize method is owned by TextAnalysis, and I see that TextAnalysis 0.8 no longer supports the method that dispatches on strings. One workaround is to use TextAnalysis 0.7.5.
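For anyone wanting to try that workaround, a minimal sketch of pinning the older release with Pkg follows. It assumes the active environment's other compat bounds (including MLJText's) still allow TextAnalysis 0.7.5; that has not been verified here.

```julia
using Pkg

# Install the last release before 0.8 (assumes the resolver can satisfy
# the other packages in the active environment with this version).
Pkg.add(name="TextAnalysis", version="0.7.5")

# Optionally pin it so a later `Pkg.update()` does not move it back to 0.8.
Pkg.pin("TextAnalysis")
```

Running `Pkg.free("TextAnalysis")` later removes the pin.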
Posted a query here about what the correct method is now.
It looks like it requires a "language" to be set, but it's not actually used: https://github.com/JuliaText/TextAnalysis.jl/blame/master/src/tokenizer.jl
I think one thing we can do is just use the tokenization from WordTokenizers directly. I can look at making this change.
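As a rough sketch of what that could look like, the snippet below calls `tokenize` from WordTokenizers on a plain string, so no language argument is involved. The sample sentence is made up for illustration, and the exact tokens depend on whichever tokenizer WordTokenizers currently has set as its default.

```julia
using WordTokenizers

# Hypothetical sample document, for illustration only.
doc = "The quick brown fox jumps over the lazy dog."

# WordTokenizers.tokenize takes the string directly and returns
# a vector of string tokens.
tokens = WordTokenizers.tokenize(doc)

println(tokens)
```

The resulting token vector could then be passed on wherever the examples previously used the output of the TextAnalysis method.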
Thanks @pazzo83 for looking into this. This doesn't appear to affect MLJText tests, which are still passing (the compat for TextAnalysis includes the latest 0.8).
So maybe all that's needed is to update the docstrings.
@pazzo83 Any chance of a PR to fix this?
Sorry for the delay! I realized that we only need to update the Readme (since this is the code that is being referred to). I will be doing that.
If I run any of the example code, I get a failure.
will fail: