Open petrbouchal opened 1 year ago
This is purely about documentation.
The documentation for step_tokenize(), under the parameter training_options, says:
A list of options passed to the tokenizer when it is being trained. Only applicable for engine == "tokenizers.bpe".
It also applies to udpipe as per https://www.emilhvitfeldt.com/post/textrecipes-version-0-4-0/.
It would be great if this could be added, otherwise operating udpipe via step_tokenize() is a bit of a mystery.
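For reference, here is a minimal sketch of what the linked post appears to describe: passing a udpipe model to step_tokenize() through training_options. It assumes that the udpipe engine reads the model from a list element named model, and that the object returned by udpipe::udpipe_download_model() is accepted there. The example_data data frame is made up purely for illustration, and the download step needs network access.

```r
library(recipes)
library(textrecipes)
library(udpipe)

# Download a pretrained English udpipe model (requires network access).
udmodel <- udpipe_download_model(language = "english")

# Hypothetical toy data, used only to illustrate the call.
example_data <- data.frame(
  text = c("This is a sentence.", "Here is another one.")
)

# Assumption: the udpipe engine picks up the model from
# training_options = list(model = ...), as the linked post suggests.
rec <- recipe(~ text, data = example_data) |>
  step_tokenize(
    text,
    engine = "udpipe",
    training_options = list(model = udmodel)
  )

# Estimate the recipe and return the tokenized training data.
prepped <- prep(rec)
bake(prepped, new_data = NULL)
```

If this is indeed the intended usage, a sentence along these lines under training_options would remove the guesswork.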
Hello @petrbouchal 👋
This is very useful feedback! Please feel free to add other issues if other things are unclear!