The standard in ParaCrawl is that the user can specify a tokenizer rather than it being hard-coded into the packages. This way some user can handle Chinese etc.
I suspect this will be moot once integrated into bitextor since it should be tokenizing for you anyway.
The standard in ParaCrawl is that the user can specify a tokenizer rather than it being hard-coded into the packages. This way some user can handle Chinese etc.
I suspect this will be moot once integrated into bitextor since it should be tokenizing for you anyway.