Given the localization validators outlined in #69 and #71, it may be helpful to extend the library with language-detection capabilities: namely, a validator and a checker that detect the language used in a given string, along the following lines:
validators.in_language(value, ..., standard = None) where:
value is the string whose contents should be checked to identify the language
standard indicates which language-code standard the returned value should follow; if None, the human-readable language name is returned (e.g. "American English")
checkers.is_in_language(value, languages) which returns True if value is detected to be in one of the languages contained in languages
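To make the proposed checker concrete, here is a minimal sketch of how is_in_language could behave. It is not an existing part of the library: the detector keyword argument and the toy_detector function are hypothetical, added only so the example runs without any third-party dependency. A real implementation would plug an actual detection backend in where toy_detector sits.

```python
from typing import Callable, Iterable, Optional, Union

# Hypothetical detector contract: takes text, returns a language code or None.
Detector = Callable[[str], Optional[str]]

# Tiny stopword lists, for illustration only. Not a real detection method.
_STOPWORDS = {
    "en": {"the", "and", "is", "of"},
    "es": {"el", "la", "es", "y", "de"},
}


def toy_detector(text: str) -> Optional[str]:
    """Guess a language by counting stopword overlap (illustrative only)."""
    words = set(text.lower().split())
    scores = {code: len(words & stops) for code, stops in _STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def is_in_language(
    value: str,
    languages: Union[str, Iterable[str]],
    detector: Detector = toy_detector,  # assumption: pluggable backend
) -> bool:
    """Return True if value is detected to be in one of languages."""
    if not isinstance(value, str) or not value.strip():
        return False
    if isinstance(languages, str):
        languages = [languages]
    detected = detector(value)
    return detected is not None and detected in set(languages)


print(is_in_language("the cat is on the table", ["en", "fr"]))
print(is_in_language("la casa es grande y bonita", "en"))
```

Keeping the detector injectable leaves the question of which library to use open, which is the main design decision discussed in the rest of this issue.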
IMPORTANT: Language detection is non-trivial, and numerous third-party libraries already attempt it. The key considerations are performance and accuracy, and different libraries perform differently depending on the length and complexity of value (the text content).
There are several important questions that need to be answered for this feature:
Should language detection be built into the Validator Collection, or leverage an outside library?
If leveraging an outside library, should that dependency be coupled with the Validator Collection (present in requirements.txt) or should it be considered a conditional dependency?
Should there be an "import selection tree" which tries to optimize for the language detection library that is best for a given value length AND that is available in the runtime environment?
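As a rough illustration of the last two questions, the sketch below treats detection libraries as conditional dependencies and picks one based on the length of value and on what is importable at runtime. Everything here is an assumption: select_backend, module_available and the PREFERENCES table are hypothetical names, the 50-character threshold is arbitrary, and the ordering of langid and langdetect is a placeholder rather than a benchmark result.

```python
import importlib.util
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical preference table: (maximum text length, modules in order of
# preference). None means "no upper bound". Ordering is a placeholder only.
PREFERENCES: List[Tuple[Optional[int], List[str]]] = [
    (50, ["langid", "langdetect"]),
    (None, ["langdetect", "langid"]),
]


def module_available(name: str) -> bool:
    """Check whether a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


def select_backend(
    value: str,
    preferences: Sequence[Tuple[Optional[int], List[str]]] = PREFERENCES,
    is_available: Callable[[str], bool] = module_available,
) -> Optional[str]:
    """Return the name of the first available module suited to len(value)."""
    length = len(value)
    for max_length, candidates in preferences:
        if max_length is None or length <= max_length:
            for name in candidates:
                if is_available(name):
                    return name
            return None  # matching tier found, but nothing is installed
    return None


# Prints a module name if one of the candidates is installed, otherwise None.
print(select_backend("short text"))
```

If no candidate is installed, select_backend returns None, and the validator could then raise a clear error asking the user to install one of the optional libraries.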