insightindustry / validator-collection

Python library of 60+ commonly-used validator functions
http://validator-collection.readthedocs.io/en/latest/index.html
MIT License

Support Language Detection #72

Open insightindustry opened 3 years ago

insightindustry commented 3 years ago

Given the localization validators outlined in #69 and #71, it may be helpful to extend the library with language-detection capabilities. Specifically, the library could add a validator and a checker that detect the language used in a given string.
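
As a rough illustration of what such a validator/checker pair might look like, here is a minimal sketch. Every name in it (`language`, `is_language`, `toy_detector`, `LanguageDetectionError`) is hypothetical and not part of the Validator Collection, and the stopword-based detector is only a stand-in for a real detection library. It assumes the library's usual convention that a validator returns the value or raises, while a checker returns a boolean.

```python
# Hypothetical sketch: none of these names exist in the Validator Collection.
TOY_STOPWORDS = {
    "en": {"the", "and", "is", "of", "to"},
    "es": {"el", "la", "y", "es", "de"},
    "fr": {"le", "et", "est", "les", "des"},
}


class LanguageDetectionError(ValueError):
    """Raised when the value is not in the expected language (hypothetical)."""


def toy_detector(text):
    """Stand-in detector: return the code with the most stopword hits, or None."""
    words = text.lower().split()
    scores = {
        code: sum(word in stops for word in words)
        for code, stops in TOY_STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def language(value, expected, detector=toy_detector):
    """Validator-style: return ``value`` if it is detected as ``expected``, else raise."""
    if not isinstance(value, str):
        raise TypeError("value must be a string")
    if not value:
        raise ValueError("value was empty")
    detected = detector(value)
    if detected != expected:
        raise LanguageDetectionError(
            "expected %r, detected %r" % (expected, detected)
        )
    return value


def is_language(value, expected, detector=toy_detector):
    """Checker-style: return True if ``value`` validates as ``expected``, else False."""
    try:
        language(value, expected, detector=detector)
    except (TypeError, ValueError):
        return False
    return True
```

Passing the detector in as a callable keeps the validator independent of whichever detection library ends up being used.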

IMPORTANT: Language detection is non-trivial, and numerous third-party libraries already attempt it. The key considerations are performance and accuracy, and different libraries perform differently depending on the length and complexity of the value (text content).

insightindustry commented 3 years ago

There are several important questions that need to be answered for this feature:

  1. Should language detection be built into the Validator Collection, or leverage an outside library?
  2. If leveraging an outside library, should that dependency be coupled with the Validator Collection (present in requirements.txt) or should it be considered a conditional dependency?
  3. Should there be an "import selection tree" which tries to optimize for the language detection library that is best for a given value length AND that is available in the runtime environment?
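
To make the third question more concrete, one possible shape for an "import selection tree" is sketched below. It assumes the detection libraries are conditional dependencies that may or may not be installed. `langdetect`, `langid`, and `pycld2` are real packages, but the preference order and the length thresholds here are purely illustrative assumptions, not benchmark results, and `select_detector` / `is_installed` are hypothetical names.

```python
import importlib.util

# (module name, minimum text length, maximum text length or None)
# The ordering and thresholds are illustrative assumptions only.
CANDIDATES = [
    ("pycld2", 200, None),
    ("langid", 20, None),
    ("langdetect", 0, None),
]


def is_installed(module_name):
    """Return True if the module can be imported in this runtime environment."""
    return importlib.util.find_spec(module_name) is not None


def select_detector(value, candidates=CANDIDATES, available=is_installed):
    """Return the name of the first candidate library that suits the length
    of ``value`` and is available, or None if no candidate qualifies."""
    length = len(value)
    for module_name, min_len, max_len in candidates:
        if length < min_len:
            continue
        if max_len is not None and length > max_len:
            continue
        if available(module_name):
            return module_name
    return None
```

The `available` parameter exists so the selection logic can be exercised without installing any of the libraries, for example `select_detector("hello", available=lambda name: False)`, which returns `None`.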