chartbeat-labs / textacy

NLP, before and after spaCy
https://textacy.readthedocs.io

Upgrade lang identifier model #375

Closed: bdewilde closed this 1 year ago

bdewilde commented 1 year ago

Description

Motivation and Context

I wanted to update my home-brewed language identification code to use something more standard and, ideally, faster and more accurate. I also want to use floret for other purposes, so this use case brings it into textacy ahead of that later work.

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

Checklist: