FilipePS / Traduzir-paginas-web

Translate your page in real time using Google or Yandex
https://addons.mozilla.org/pt-BR/firefox/addon/traduzir-paginas-web/
Mozilla Public License 2.0

Add New Translation Engine - Lingva Translate #262

Closed: ghost closed this issue 2 years ago

ghost commented 2 years ago

Lingva Translate is an alternative front-end for Google Translate. It is a free and open-source translator with over a hundred languages available, much as Invidious and Nitter are alternative front-ends for YouTube and Twitter.

It would be great if you added it as a translation engine, since it would preserve our privacy without relying on Google and exposing our data.

This would be great for de-Googling.

Main Tasks :

FilipePS commented 2 years ago

Lingva Translate is not a translation service; it is just an alternative interface to Google Translate. Internally, its server uses translate.google.com to translate the text, so this project is of no use to my extension.

Mozilla has an extension under development called Firefox Translations, which is based on the Bergamot Project and allows offline translation running directly in the browser. Although it currently works, it will still take a while before it is good enough to be built into the browser.

ghost commented 2 years ago

So, how does Translate Web Pages work?

Is it similar to Lingva? Could you please explain in detail?

FilipePS commented 2 years ago

Before understanding how Translate Web Pages works today, you need to understand how it worked in the beginning.

In the beginning, Translate Web Pages injected the Google Translate widget into the site you wanted to translate. Some extensions still do this, though I could not find one to mention here. You can see how these widgets work here: How To Google Translate

The widget did all the translation work, but it had some problems:

  1. NoScript-type extensions block the widget, making translation impossible.
  2. There was once an extension that did this, and it was blocked by Mozilla simply because it injected external scripts into the page.
  3. Google can track users more easily.

So I decided to switch to an approach that does not use widgets. To translate a page, the extension basically does this:

  1. Read the entire HTML structure of the page to get all nodes that must be translated.
  2. Get the texts from these nodes and send them to a translation service.
  3. Replace the original text of the page with the translated text.
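
The three steps above can be sketched roughly as follows. This is only an illustration in Python using the standard html.parser module, not the extension's actual code (the extension works on the live page inside the browser). The names TextNodeCollector, translate_page, fake_translate and SKIP_TAGS are hypothetical, and fake_translate is a local stand-in for the call to a real translation service. For brevity, the sketch drops comments and the doctype.

```python
import html
from html.parser import HTMLParser

# Assumption: the content of these tags is never sent for translation.
SKIP_TAGS = {"script", "style"}


class TextNodeCollector(HTMLParser):
    """Step 1: split the page into fragments and note which ones are text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pieces = []      # every fragment of the page, in document order
        self.text_slots = []  # indexes of the fragments that hold translatable text
        self._skip_depth = 0  # > 0 while inside a tag listed in SKIP_TAGS

    def handle_starttag(self, tag, attrs):
        self.pieces.append(self.get_starttag_text())
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self.pieces.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.pieces.append(f"</{tag}>")
        if tag in SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_slots.append(len(self.pieces))
        self.pieces.append(data)


def fake_translate(texts):
    """Hypothetical stand-in for the translation service (no network access)."""
    glossary = {"Willkommen bei Wikipedia": "Welcome to Wikipedia"}
    return [glossary.get(text, text) for text in texts]


def translate_page(html_source, translate_batch):
    collector = TextNodeCollector()
    collector.feed(html_source)
    collector.close()
    # Step 2: gather the texts and hand them to the translation service.
    originals = [collector.pieces[slot] for slot in collector.text_slots]
    translated = translate_batch(originals)
    # Step 3: put the translated text where the original text was.
    for slot, text in zip(collector.text_slots, translated):
        collector.pieces[slot] = html.escape(text, quote=False)
    return "".join(collector.pieces), originals


page = '<p>Willkommen bei Wikipedia</p><script>var a = "Hallo";</script>'
translated_page, sent_texts = translate_page(page, fake_translate)
print(sent_texts)
print(translated_page)
```

In this sketch only the paragraph text is handed to the translator; the markup and the script body never leave the function.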

Note that in this approach no Google script runs in your browser. All Google knows is your IP address and the text to be translated. It does not know (directly) which site you are translating, although the translated text may reveal this.

Here is a translation request URL: https://translate.googleapis.com/translate_a/t?anno=3&client=te&v=1.0&format=html&sl=auto&tl=en&tk=942400.571047&q=%3Cpre%3EWillkommen%20bei%20Wikipedia%3C%2Fpre%3E If you click this link, a text file is downloaded; open it to see the response.
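
To see what such a request carries, the URL above can be split into its query parameters. The snippet below is a small sketch using Python's standard urllib.parse; describe_request is a hypothetical helper name. Reading sl as the source language, tl as the target language and q as the text to translate is an interpretation of the parameter names, not something stated in this thread.

```python
from urllib.parse import parse_qs, urlsplit

# The request URL quoted in the comment above, unchanged.
REQUEST_URL = (
    "https://translate.googleapis.com/translate_a/t"
    "?anno=3&client=te&v=1.0&format=html&sl=auto&tl=en&tk=942400.571047"
    "&q=%3Cpre%3EWillkommen%20bei%20Wikipedia%3C%2Fpre%3E"
)


def describe_request(url):
    """Return the host, the path and the decoded query parameters of a URL."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; each key appears once here, so keep the first value.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.netloc, parts.path, params


host, path, params = describe_request(REQUEST_URL)
print(host, path)
for name, value in params.items():
    print(f"{name} = {value}")
```

The decoded q value is the HTML fragment being translated. None of the parameters in this query string names the page the text came from.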

These requests are made directly by my extension.

ghost commented 2 years ago

Oh, thanks @FilipePS, I got it. But is the entire HTML structure read locally in the browser?

FilipePS commented 2 years ago

> Oh, thanks @FilipePS, I got it. But is the entire HTML structure read locally in the browser?

Yes

ghost commented 2 years ago

When reading and processing the HTML structure, does it grab only the text in languages that need to be translated (and if so, how?), or does it send all the text in the HTML structure to Google?

ghost commented 2 years ago

Can you reply to this, @FilipePS? It has been open for a long time.

FilipePS commented 2 years ago

> When reading and processing the HTML structure, does it grab only the text in languages that need to be translated (and if so, how?), or does it send all the text in the HTML structure to Google?

It basically sends everything as you scroll through the page. No per-text language detection is done, mainly because of the limitations of the browser's native language detection.
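
A toy model of "sends everything as you scroll" might look like the sketch below. It assumes that each block of text is sent once, when it first enters the visible part of the page; the thread does not describe the exact rule the extension uses. TextBlock and blocks_to_send are hypothetical names, and the positions are made-up pixel offsets.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    top: int            # vertical position of the block on the page, in pixels
    text: str
    sent: bool = False  # True once the block has been sent for translation


def blocks_to_send(blocks, scroll_top, viewport_height):
    """Return the text of every unsent block currently inside the viewport."""
    bottom = scroll_top + viewport_height
    due = [b for b in blocks if not b.sent and scroll_top <= b.top < bottom]
    for block in due:
        block.sent = True
    # No language or content check: whatever is visible gets sent.
    return [block.text for block in due]


blocks = [
    TextBlock(top=0, text="Willkommen"),
    TextBlock(top=500, text="user@example.com"),
    TextBlock(top=1500, text="Impressum"),
]

first = blocks_to_send(blocks, scroll_top=0, viewport_height=800)
second = blocks_to_send(blocks, scroll_top=0, viewport_height=800)
third = blocks_to_send(blocks, scroll_top=1000, viewport_height=800)
print(first, second, third)
```

Because nothing inspects the text, the placeholder email address is sent along with everything else as soon as it scrolls into view.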

ghost commented 2 years ago

I thought it sent only the text that needs to be translated. What if my email address or other personal information appears on a page I am logged in to, and that data is sent to Google or Yandex, violating my privacy? Isn't that a big problem?

FilipePS commented 2 years ago

It is not easy to determine whether a text contains personal information that should not be translated. Chrome itself translates emails, etc.

I believe that your privacy is already violated from the moment any text is sent to external servers, because that text can be used to determine which website you are visiting.

It is your decision to use a service that sends website content to external servers.