hxebolax / TranslateAdvanced

Advanced Translator for NVDA is an add-on that translates text using Google Translate, DeepL, LibreTranslate, and Microsoft Translator. It offers simultaneous translation, a history, and API key management. It is easy to configure and use, with keyboard shortcuts and customizable options in the NVDA menu.
GNU General Public License v2.0
4 stars 5 forks

Add local language detection #2

Closed fastfinge closed 2 months ago

fastfinge commented 2 months ago

When translation is enabled, especially while navigating user interfaces, some text is already in the target language and does not need to be translated. However, the add-on still sends this text to the translation service. This can cost money and be slow. Also, Microsoft Translator cannot auto-detect languages. For these reasons, it would be useful to add local language detection to the add-on. That way, if the source and target languages are the same, NVDA would not send the text over the internet. Microsoft Translator would also work without setting a source language. Lastly, it would improve the privacy and security of the add-on. If Windows sends you an unexpected notification, or a dialog pops up with private information, it is likely to be in your language already. If the add-on detects the language of the text locally, this text would not be sent over the internet. The best way to do this is: https://github.com/pemistahl/lingua-py

It would also be a good idea to let users with low memory choose a low-accuracy mode, or turn off local language detection entirely, in the add-on configuration. Users with enough memory and CPU, however, could save money and bandwidth while remaining slightly more private.
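
To make the request concrete, here is a minimal Python sketch of the gate I have in mind. It is not taken from the add-on's code: the mode names, the `detectors` mapping, and the `translate` callable are all hypothetical placeholders. A library such as lingua-py could supply the detector functions, but any callable that returns a language code would do.

```python
from typing import Callable, Dict, Optional

# Hypothetical setting values: "off" disables local detection entirely,
# "low" and "high" select a cheaper or a more accurate detector.
MODE_OFF = "off"
MODE_LOW = "low"
MODE_HIGH = "high"

# A detector takes text and returns a language code such as "en", or None.
Detector = Callable[[str], Optional[str]]
# A translator takes (text, target_lang) and returns the translated text.
Translator = Callable[[str, str], str]


def translate_if_needed(
    text: str,
    target_lang: str,
    translate: Translator,
    detectors: Dict[str, Detector],
    mode: str = MODE_HIGH,
) -> str:
    """Return text unchanged when it is detected locally as target_lang.

    In every other case (detection off, no detector for the mode, unknown
    language, or a different language) the text goes to the remote service.
    """
    if mode != MODE_OFF:
        detect = detectors.get(mode)
        if detect is not None and detect(text) == target_lang:
            # Already in the target language: nothing leaves the machine.
            return text
    return translate(text, target_lang)
```

With this shape, text that is already in the user's language is never passed to `translate`, so it costs nothing and stays local. Setting the mode to `"off"` restores the current behavior.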

hxebolax commented 2 months ago

Hello, thank you for your suggestion. This topic is being investigated, but let me mention that one of my principles for this plugin is not to include external libraries, to avoid increasing its size. There are many libraries for detecting languages, but they all have several issues.

  1. Detecting the language is not instantaneous. The delay is small on its own, but in a simultaneous translator it comes on top of the time already needed to send the text to external services, so the overall response time increases.

  2. Language detection is not perfect and could do more harm than good. If a language detector makes a mistake, we will send the text to the translator with an incorrect source language. As a result, we might receive the original text back and waste time.

  3. The library you proposed is already known to me, but implementing it would increase the plugin's size by 80 megabytes. Unfortunately, choosing a smaller library also means a loss of detection quality, and not all libraries include the most widely spoken languages.

When using a plugin like this, the user must be clear that the data is being sent to external services, so they need to be careful with the information they send. I make this very clear in the documentation, stating that I disclaim any responsibility. It is also the user's responsibility to use plugins like this appropriately, employing them when necessary and ensuring that they do not send sensitive data.

Finally, I want to mention that in the future this plugin will offer the possibility of using Bergamot's offline translation models, which are the ones used offline by Mozilla Firefox's translator. https://browser.mt/ These models will comply with privacy standards and have excellent performance on computers with low resources.

Likewise, as I mentioned at the beginning, language detection is being investigated. However, we are trying to use components that are usually available in Windows, such as MsSpellCheckingFacility.dll and other language-handling libraries built into Windows, which make this task easier because they are natively included in most Windows systems.
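
As a rough illustration of the concern in point 2, the Python sketch below only trusts a detection result when it is confident enough, and otherwise sends the text without an explicit source language. It is not the plugin's code. The function name, the `(language, confidence)` return shape of `detect`, and the default thresholds are assumptions chosen for the example, and the detector itself could be any local component.

```python
from typing import Callable, Optional, Tuple

# Assumed detector shape: returns (language_code_or_None, confidence in 0..1).
Detect = Callable[[str], Tuple[Optional[str], float]]


def choose_source_language(
    text: str,
    target_lang: str,
    detect: Detect,
    min_confidence: float = 0.9,  # assumed threshold, not a measured value
    min_length: int = 20,  # assumed: very short strings are hard to detect
) -> Tuple[str, Optional[str]]:
    """Decide what to do with text before calling a translation service.

    Returns ("skip", lang) when the text is confidently in target_lang,
    ("translate", lang) when it is confidently in another language, and
    ("translate", None) when detection is not trusted, so that no possibly
    wrong source language is sent along with the text.
    """
    if len(text.strip()) < min_length:
        return ("translate", None)
    lang, confidence = detect(text)
    if lang is None or confidence < min_confidence:
        return ("translate", None)
    if lang == target_lang:
        return ("skip", lang)
    return ("translate", lang)
```

A mistaken but low-confidence detection then costs one normal request instead of a request with the wrong source language. Raising `min_confidence` trades fewer skipped requests for fewer wrong decisions.
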
fastfinge commented 2 months ago

Thank you for your response to my idea! The primary reason I was thinking about this is that DeepL can cost a lot of money, and I would like to avoid sending unnecessary text to the API.

I think using the language detection libraries already in Windows is a much better idea. I didn't know that we could access them from Python. I look forward to learning from your source code if you make it work!

I'm also excited about the offline translation models. Personally, I use DeepL because it's the only translator I trust. I only speak English, so I can't check how well it translated. But I find that people understand me best when I'm using DeepL. However, perhaps Bergamot would be good enough for reading. It's always good to have more options!