Open rillian opened 9 months ago
cc: @atuchin-m
The page doesn't provide any relevant lang tags, so the only way we can guess the language is with the third-party CLD3 engine. It is only an approximation and can't produce the correct result on 100% of sites.
Chrome uses another, more advanced model and still detects the page as CN.
The only way is to select the language manually. You could also send a bug report to the page owners.
brave://translate-internals/#detection-logs can help to understand the detection logic.
Translation can also be disabled for a specific site.
Thanks, those are good workarounds. And perhaps Coptic is such a minority language we don't want to bother, but I was reporting it as an annoyance I'd encountered.
Adding a lang attribute does suppress the translation controls entirely, so that's a good solution for page authors.
Description
The page translation feature mis-detects Coptic as Marathi, which is distracting and not helpful.
Steps to Reproduce
Actual result:
The translation control dropdown opens, offering to translate Marathi to English. Clicking on 'English' replaces the text with question marks.
Expected result:
Translation drop-down should not open automatically when translation is not possible. Requesting a translation shouldn't mangle text in other languages.
Reproduces how often:
Easily
Brave version (brave://version info)
1.63.141 Chromium: 121.0.6167.139 (Official Build) beta (64-bit)
Version/Channel Information:
Miscellaneous Information:
Coptic and Marathi use distinct Unicode blocks, so this sort of mis-detection could probably be avoided by a simple pre-filter based on character distribution if the models aren't making correct determinations.
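
To illustrate the kind of pre-filter I mean, here is a rough Python sketch. It is not how CLD3 or Brave's translate pipeline works; the function names, the language-to-script table, and the 0.5 threshold are all made up for illustration. The code point ranges are the Coptic letters in the Greek and Coptic block plus the Coptic block, and the Devanagari and Devanagari Extended blocks. The idea is to count which script the letters on the page belong to and reject a detected language whose script doesn't match.

```python
from collections import Counter

# Inclusive code point ranges per script (a small subset, for illustration only).
SCRIPT_RANGES = {
    "Coptic": [(0x03E2, 0x03EF), (0x2C80, 0x2CFF)],
    "Devanagari": [(0x0900, 0x097F), (0xA8E0, 0xA8FF)],
}

# Hypothetical table mapping a detected language code to the script it is written in.
LANG_SCRIPT = {"mr": "Devanagari", "hi": "Devanagari", "cop": "Coptic"}


def script_of(ch):
    """Return the script name for a character, or None if it is not in the table."""
    cp = ord(ch)
    for script, ranges in SCRIPT_RANGES.items():
        for lo, hi in ranges:
            if lo <= cp <= hi:
                return script
    return None


def dominant_script(text, threshold=0.5):
    """Return the script covering at least `threshold` of the letters, else None."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return None
    counts = Counter(script_of(ch) for ch in letters)
    counts.pop(None, None)  # ignore letters from scripts we don't track
    if not counts:
        return None
    script, n = counts.most_common(1)[0]
    return script if n / len(letters) >= threshold else None


def plausible_detection(text, detected_lang):
    """Return False only when the page's dominant script contradicts the detected language."""
    expected = LANG_SCRIPT.get(detected_lang)
    actual = dominant_script(text)
    if expected is None or actual is None:
        return True  # not enough information to veto the model's answer
    return expected == actual
```

With a check like this, `plausible_detection(page_text, "mr")` would come back false for a page written in Coptic letters, and the translation dropdown could stay closed instead of offering Marathi. The filter only vetoes a result; it falls back to trusting the model whenever the script or language is not in its tables.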