Closed r12a closed 2 years ago
If the fallback language isn't supported at all, I think the recognizer should return null
as the prediction result.
This aside, I'd expect developers to check language support with queryHandwritingRecognizerSupport, decide if the support meets their requirement, and provide a reasonable hint. Providing an unsupported language as a hint means this hint is invalid, and the recognizer can ignore it and do whatever it finds most appropriate.
Not providing a hint means the recognizer should choose an appropriate language from its supported list (it's okay if it picks the wrong one; that just makes the recognition result unusable). Considering language
Created a PR for fallback (if no language is supported): https://github.com/WICG/handwriting-recognition/pull/8
- If there's no dedicated models for that script, the recognizer falls back to the macro language (zh-Hans falls back to zh).
Is there a terminology issue here? zh is indeed a macrolanguage, but not every language tag that has a script subtag starts with a macrolanguage. They do, however, all start with a 'language subtag'. Perhaps the sentence should say that it falls back to the language subtag, or, even better, that it removes subtags until a match is found. That way zh-Hant-HK could fall back to a generic Traditional Chinese recogniser, or a generic Chinese one.
However, any such fallback will probably work only if the recogniser associates the language tag implicitly with a given script, because the process of recognising is very much tied to the orthography used. This is ok for many languages, and indeed BCP 47 rules actively encourage association of a default script with bare language tags, but not for all. For example, if az-Arab falls back to az, this is of no help if the Azeri recogniser only works with Cyrillic.
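To make the script concern concrete, here is a minimal sketch of a check a caller could run before accepting a fallback. Both function names are hypothetical and not part of the proposed API, and the idea that a recogniser exposes the script its model handles (`modelScript`) is an assumption made only for illustration.

```typescript
// Hypothetical helper: pull the script subtag out of a BCP 47 tag.
// A script subtag is exactly four letters and comes after the language subtag.
function scriptSubtag(tag: string): string | null {
  for (const subtag of tag.split("-").slice(1)) {
    // A single-character subtag starts an extension or private-use section.
    if (subtag.length === 1) break;
    if (/^[A-Za-z]{4}$/.test(subtag)) {
      // Return it in the conventional title case, e.g. "Arab".
      return subtag.charAt(0).toUpperCase() + subtag.slice(1).toLowerCase();
    }
  }
  return null;
}

// Hypothetical check: a fallback is only useful if it does not contradict
// a script the caller asked for explicitly. `modelScript` is assumed to be
// the script the fallback recogniser actually handles.
function fallbackKeepsScript(requested: string, modelScript: string): boolean {
  const wanted = scriptSubtag(requested);
  return wanted === null || wanted.toLowerCase() === modelScript.toLowerCase();
}
```

With these definitions, `fallbackKeepsScript("az-arab", "Cyrl")` is false, which is the Azeri case described above, while a bare `az` request has no explicit script to contradict and passes.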
Indeed a terminology issue on my side. :)
Updated to "remove the last subtag until there is a match, ...", since it's a straightforward rule. https://github.com/WICG/handwriting-recognition/commit/0384cbd4c8f1f973fe54075e8861df32de3af9ab
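For illustration, the "remove the last subtag until there is a match" rule could be sketched roughly as below. This is not the recognizer's implementation: `resolveLanguageTag` and the `supported` list are hypothetical names, and the sketch ignores refinements such as the special handling of single-letter subtags in the RFC 4647 lookup scheme.

```typescript
// Hypothetical helper, not part of the proposed API.
// `supported` stands in for whatever list of tags a recognizer reports.
function resolveLanguageTag(
  requested: string,
  supported: readonly string[],
): string | null {
  // BCP 47 tags are case-insensitive, so compare in lower case.
  const subtags = requested.toLowerCase().split("-");
  while (subtags.length > 0) {
    const candidate = subtags.join("-");
    for (const tag of supported) {
      if (tag.toLowerCase() === candidate) return tag;
    }
    subtags.pop(); // drop the last subtag and try again
  }
  return null; // nothing matched, not even the bare language subtag
}
```

Under these assumptions, `resolveLanguageTag("zh-Hant-HK", ["en", "zh", "zh-Hant"])` gives `"zh-Hant"`, `resolveLanguageTag("zh-Hans", ["en", "zh"])` gives `"zh"`, and a request for `"fr-CA"` against the same list gives `null`.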
As for "associates the language tag implicitly with a given script", I think this is the case for most recognizer implementations. I can't speak for all implementations, but the one we have at Google will attempt to recognize scripts that make sense to appear in that language.
Closing. The terminology has been updated.
https://github.com/WICG/handwriting-recognition/blob/main/explainer.md#recognition-hints
What if the browser default language is set to something that the recogniser cannot deal with?
I suspect that, like for hyphens in CSS, it must be required that a language be selected for this to work, and the selection would be from a list of languages supported by the recogniser.