hplt-project / sacremoses

Python port of Moses tokenizer, truecaser and normalizer
MIT License

Documentation #12

Open alvations opened 5 years ago

alvations commented 5 years ago

Now that we have more than tokenization, we need some proper documentation.

alvations commented 5 years ago

Need to hurry up with the docs...

ETA: 20 May 19

Sorry for the delay, to whoever is following this repo ;P

alvations commented 4 years ago

Bumping issue -_- |||

Please do this asap...

mfoglio commented 4 years ago

I'd like to try sacremoses because I am in need of a truecaser. The documentation does not link to a pretrained model. Where can I find it?

alvations commented 4 years ago

@mfoglio there are no pre-trained models in this library. They are purely rule-based ;P

mfoglio commented 4 years ago

OK, just to be sure I expressed myself correctly: I see on the home page that you can train MosesTruecaser().

For instance:

mtr = MosesTruecaser()
mtr.truecase("hello my friend")

Returns:

AssertionError: 
Use Truecaser.train() to train a model.
Or use Truecaser('modefile') to load a model.

Where can I find the rule-based models?
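
For context on the assertion above: a truecaser of this kind learns the usual casing of each word from text, so nothing can be truecased until a model has been trained or loaded. The sketch below is a toy, standard-library-only illustration of that idea and is not sacremoses code. The function names `train_truecase_model` and `truecase`, the tiny corpus, and the rule of skipping sentence-initial tokens are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_truecase_model(tokenized_sentences):
    """Learn the most frequent surface form of each word (hypothetical helper).

    Sentence-initial tokens are skipped because their capitalization
    reflects position rather than the word's usual casing.
    """
    counts = defaultdict(Counter)
    for sentence in tokenized_sentences:
        for token in sentence[1:]:
            counts[token.lower()][token] += 1
    # Keep only the most frequent surface form per lowercased word.
    return {word: forms.most_common(1)[0][0] for word, forms in counts.items()}


def truecase(tokens, model):
    """Replace each token with its learned casing; unknown words are left as-is."""
    return [model.get(token.lower(), token) for token in tokens]


# A tiny, made-up training corpus (already tokenized on whitespace).
corpus = [
    "Yesterday I met Sherlock Holmes in London".split(),
    "People say the detective Holmes lives in London".split(),
    "The adventures of Holmes are famous".split(),
]
model = train_truecase_model(corpus)

print(truecase("THE ADVENTURES OF SHERLOCK HOLMES".split(), model))
# ['the', 'adventures', 'of', 'Sherlock', 'Holmes']

# With an empty (untrained) model there is nothing to apply.
print(truecase("hello my friend".split(), {}))
# ['hello', 'my', 'friend']
```

The second call shows why an untrained truecaser has nothing useful to return. In sacremoses, the assertion message points to the corresponding steps: train a model with `train()` on your own text, or load a previously saved model file.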