feat: faster tokenization

MinishLab / model2vec

Distill a Small Static Model from any Sentence Transformer

https://minishlab.github.io/

MIT License


Closed by stephantul 1 month ago

stephantul commented 1 month ago

A newer version of `tokenizers` introduced faster tokenization, which gives a speedup of about 5-10%.
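For illustration only, here is a minimal sketch of how calling code might opt into a faster batch method while staying compatible with older releases. It assumes the speedup refers to the `encode_batch_fast` method available in recent `tokenizers` releases, which the description above does not state explicitly. The helper name `encode_batch_compat` is hypothetical and is not part of model2vec.

```python
from typing import Any, List, Sequence


def encode_batch_compat(tokenizer: Any, sentences: Sequence[str]) -> List[List[int]]:
    """Return token ids per sentence, preferring the faster batch method if present.

    Hypothetical helper: `tokenizer` is expected to behave like a
    `tokenizers.Tokenizer`, but any object with the same methods works.
    """
    # Assumption: newer `tokenizers` releases expose `encode_batch_fast`;
    # older releases only provide `encode_batch`, so fall back to that.
    encode = getattr(tokenizer, "encode_batch_fast", None) or tokenizer.encode_batch
    encodings = encode(list(sentences), add_special_tokens=False)
    # Only the token ids are kept; offsets and other metadata are ignored.
    return [encoding.ids for encoding in encodings]
```

Because the helper only reads `.ids` from each encoding, it does not depend on offset information, which is the kind of use case where a faster, offset-free batch method would be a drop-in replacement.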

codecov[bot] commented 1 month ago

All modified and coverable lines are covered by tests :white_check_mark:

| Files with missing lines | Coverage Δ |
|---|---|
| model2vec/model.py | `96.63% <100.00%> (ø)` |