CopticScriptorium / coptic-nlp

Coptic NLP pipeline page and utilities
Apache License 2.0

Integrate binding model into stacked_tokenizer #24

Closed: lgessler closed this 5 years ago

lgessler commented 5 years ago

To test the PR:

cd lib
# train the model
python binder.py xgboost --train_list=onno+ephraim+victor+cyrus
# run the stacked tokenizer on the evaluation file
python stacked_tokenizer.py ../eval/plain/aug_bind_uddev.txt -d
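
For repeated testing, the two steps above could be wrapped in a small script so that the tokenizer only runs if training succeeds. This is a minimal sketch, not part of the PR: `build_commands` and `run_all` are hypothetical helper names, and the only things taken from the comment above are the script names, the arguments, and the `+`-joined format of `--train_list`.

```python
import subprocess


def build_commands(train_docs, eval_file):
    """Build the two command lines from the PR comment (hypothetical helper).

    train_docs: document names to join with '+' for --train_list.
    eval_file:  path to the evaluation file, relative to lib/.
    """
    train = ["python", "binder.py", "xgboost",
             "--train_list=" + "+".join(train_docs)]
    tokenize = ["python", "stacked_tokenizer.py", eval_file, "-d"]
    return [train, tokenize]


def run_all(commands, cwd="lib"):
    """Run each command in order inside cwd (hypothetical helper).

    check=True raises CalledProcessError on a non-zero exit status,
    so the tokenizer is not run if training fails.
    """
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)
```

From the repository root, this would be called as `run_all(build_commands(["onno", "ephraim", "victor", "cyrus"], "../eval/plain/aug_bind_uddev.txt"))`.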