uf-hobi-informatics-lab / ClinicalTransformerNER

a library for named entity recognition developed by UF HOBI NLP lab featuring SOTA algorithms
MIT License
142 stars, 28 forks

change tokenizer to fast tokenization #6

Closed. bugface closed this issue 3 years ago.

bugface commented 3 years ago

HuggingFace has released a Rust-based tokenizer (https://huggingface.co/docs/tokenizers/python/latest/) that is faster than our current tokenizer. We need to replace the old tokenizer with the new one.

bugface commented 3 years ago

After testing, the fast tokenizer does not bring a significant speed improvement. We will skip this issue.
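
For anyone who wants to repeat this kind of check, here is a minimal timing sketch. It is not the test that was run for this issue, and nothing in it comes from the ClinicalTransformerNER code base: the helper names (`time_tokenizer`, `compare_tokenizers`), the two stand-in tokenizers, and the sample sentence are all hypothetical. The stand-ins are only there so the script runs without downloads. The commented lines show how a slow and a fast tokenizer could be plugged in through `AutoTokenizer.from_pretrained(..., use_fast=...)` from the `transformers` package, with `bert-base-uncased` as an assumed example model.

```python
import re
import time
from typing import Callable, Dict, List

# Any callable that turns a string into a list of tokens can be benchmarked.
Tokenizer = Callable[[str], List[str]]


def time_tokenizer(tokenize: Tokenizer, texts: List[str], repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` passes through `texts`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for text in texts:
            tokenize(text)
        best = min(best, time.perf_counter() - start)
    return best


def compare_tokenizers(
    candidates: Dict[str, Tokenizer], texts: List[str], repeats: int = 3
) -> Dict[str, float]:
    """Time every candidate tokenizer on the same texts."""
    return {name: time_tokenizer(fn, texts, repeats) for name, fn in candidates.items()}


# Hypothetical stand-ins so the sketch runs offline; they are NOT the
# tokenizers discussed in this issue.
def whitespace_tokenize(text: str) -> List[str]:
    return text.split()


_WORD_RE = re.compile(r"\w+|[^\w\s]")


def regex_tokenize(text: str) -> List[str]:
    return _WORD_RE.findall(text)


# With `transformers` installed, real tokenizers could be compared instead
# (model name is an assumption, pick the one you actually use):
#   from transformers import AutoTokenizer
#   slow = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=False)
#   fast = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=True)
#   candidates = {"slow": slow.tokenize, "fast": fast.tokenize}

if __name__ == "__main__":
    sample = ["The patient was given 5 mg of warfarin daily."] * 1000
    timings = compare_tokenizers(
        {"whitespace": whitespace_tokenize, "regex": regex_tokenize}, sample
    )
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.4f} s")
```

Taking the best of several passes reduces noise from other processes. Timing one sentence at a time, as above, may understate the gain from a Rust-backed tokenizer, which tends to show up mostly on batched input, so it is worth measuring whichever call pattern the training loop really uses.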