uetchy / Polyglot

🌏 The missing Safari extension that translates selected text into your native language.

[REQ] only translate "foreign" text #29

Closed gingerbeardman closed 3 years ago

gingerbeardman commented 6 years ago

It would be great if the extension only translated "foreign" text

ideas:

[Screenshot: 2018-08-05 at 22:35:55]

devemio commented 6 years ago

To implement this, we can retrieve the detected language from the Google API and compare it with our targetLanguage. But in that case we shouldn't dispatch the showPanel message with the loading icon first. In other words, we need to translate the text first and then decide whether to show the panel. Perhaps there is another good way to determine the language of the selected text on the extension side.
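
A minimal sketch of the decision step described above, for illustration only. The names `primarySubtag` and `shouldShowPanel` are hypothetical and not part of Polyglot, and the detected language is assumed to arrive as a language code string alongside the translation.

```javascript
// Hypothetical helper: reduce a code such as "ja-JP" or "ja_JP" to its
// primary subtag ("ja") so that regional variants compare equal.
function primarySubtag(code) {
  return String(code || '').trim().toLowerCase().split(/[-_]/)[0]
}

// Hypothetical decision step, meant to run after the translation response
// arrives and before any panel is shown.
// detectedLanguage: assumed to be reported by the translation API.
// targetLanguage: the language configured as the translation target.
function shouldShowPanel(detectedLanguage, targetLanguage) {
  const detected = primarySubtag(detectedLanguage)
  const target = primarySubtag(targetLanguage)
  // If either value is missing, fall back to showing the panel.
  if (!detected || !target) return true
  return detected !== target
}
```

With this shape, a wrong or missing detection only results in the panel being shown, which is the existing behaviour.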

gingerbeardman commented 6 years ago

To get the language of the page we have the following:

  1. HTML lang (cf HTML spec) <html lang="ja-JP"...
  2. META charset (cf HTML spec) <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"...
  3. META locale (cf Facebook) <meta property="og:locale" content="ja_JP"...

If those checks disable instant translation and the user still wants to translate, they can use the context menu.

To detect user language:

  1. navigator.language (language of browser app?)
  2. navigator.languages (users preferred languages?) (these are sometimes the same but not always)

There's also Window.onlanguagechange to take care of.
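
A rough sketch of reading the user's languages as listed above. `getUserLanguages` is a hypothetical name; it takes a navigator-like object as a parameter so it can be exercised outside the browser.

```javascript
// Collect the user's preferred languages from a navigator-like object.
function getUserLanguages(nav) {
  const preferred = Array.isArray(nav.languages) ? nav.languages : []
  // Fall back to the single navigator.language value when the list is empty.
  const all = preferred.length > 0 ? preferred : [nav.language]
  // Drop empty entries and duplicates while keeping the preference order.
  return [...new Set(all.filter(Boolean))]
}

// In a browser, the list could be refreshed when the user changes settings:
// window.addEventListener('languagechange', () => {
//   userLanguages = getUserLanguages(navigator)
// })
```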

uetchy commented 6 years ago

Yes. We need to do something for this.

However, language detection is inefficient because ...

Instead of those, how about offering a toolbar button (and a keyboard shortcut) for toggling the instant translation feature?

gingerbeardman commented 6 years ago

That doesn't solve the problem IMHO.

Maybe an option to switch off language detection for Instant Translation?

devemio commented 6 years ago

Using the API to detect the language adds a lot of extra network requests.

I think some kind of shortcut to toggle the instant translation feature would be a good idea.

gingerbeardman commented 6 years ago

I'm working on this one :)

gingerbeardman commented 6 years ago

Just putting some thoughts down...

If we try to simplify the goal: what do we really want to do?

Check if the language of the text/page is the same as the target language. We don't actually need to know what it is, Google Translate can figure that out.

function getPageLanguage() {
  // Language declared on the root <html> element (empty string if absent)
  var langHTML = document.documentElement.lang

  // Character encoding of the document, e.g. "UTF-8" or "EUC-JP"
  var langCharSet = document.characterSet

  // Open Graph locale, e.g. <meta property="og:locale" content="ja_JP">
  var langOgLocale = null
  var metas = document.getElementsByTagName('meta')
  for (var i = 0; i < metas.length; i++) {
    if (metas[i].getAttribute('property') === 'og:locale')
      langOgLocale = metas[i].getAttribute('content')
  }

  if (langHTML) console.log(langHTML)
  if (langCharSet) console.log(langCharSet)
  if (langOgLocale) console.log(langOgLocale)

  return { lang: langHTML, characterSet: langCharSet, ogLocale: langOgLocale }
}

| URL | lang | characterSet | og:locale |
| --- | --- | --- | --- |
| https://www.github.com | en | UTF-8 | |
| https://www.hbo.com/about/faqs | en | UTF-8 | |
| https://www.bbc.co.uk/cymrufyw | cy | UTF-8 | cy_GB |
| https://www.jp.playstation.com | ja-JP | UTF-8 | ja_JP |
| https://ja.wikipedia.org/wiki/富士山 | ja | UTF-8 | |
| http://www.jah.ne.jp/~hanhan4/imasara/8lakes/ | | EUC-JP | |

Maybe it's enough to just check lang? The benefit is that it uses the same format in which the languages are stored in Settings.plist.

There's also a possibility that elements within the page are given a lang attribute, so it's better to start at the click and go up the DOM to the closest lang attribute.

var el = e.target.closest('[lang]') // assuming e is the click event: closest element with a lang attribute wrapping what was clicked
var langClosest = el ? el.getAttribute('lang') : null
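
As a hedged alternative, the same walk up the DOM can be written by hand. This is only a sketch: `closestLang` is a hypothetical name, and it relies only on `getAttribute` and `parentElement`, so it also accepts text nodes (which have no `closest` method) and returns null when no ancestor declares a language.

```javascript
// Walk up from a node to the nearest ancestor (or the node itself) that
// declares a non-empty lang attribute. Returns null if none is found.
function closestLang(node) {
  for (let el = node; el; el = el.parentElement) {
    // Text nodes have no getAttribute, so guard before calling it.
    const lang = el.getAttribute ? el.getAttribute('lang') : null
    if (lang) return lang
  }
  return null
}
```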

What if our logic gets it wrong? The user still sees the translation! No worries.

uetchy commented 6 years ago

Thank you for the detailed survey!

I think this feature could be shipped if the following issues were solved.

Consider filing a PR for this feature so we can test whether it works well in the real world 👍

gingerbeardman commented 6 years ago

Agreed. This is a difficult problem to solve satisfactorily. I will give it more thought.

gingerbeardman commented 5 years ago

I was thinking it is possible to guess the encoding of the text using JavaScript or Obj-C (now that the extension is native), with a quick initial check to see whether it contains only ASCII bytes.

Background https://unicodebook.readthedocs.io/guess_encoding.html

JavaScript https://github.com/aadsm/jschardet

Obj-C (Foundation) https://developer.apple.com/documentation/foundation/nsstring/1413576-stringencodingfordata?language=objc
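
A minimal sketch of the quick ASCII check mentioned above. `isAsciiOnly` is a hypothetical name. Note the assumption: it inspects the code units of a JavaScript string rather than raw bytes, and ASCII-only text says nothing about which language it is, so it could only serve as a cheap pre-filter.

```javascript
// Return true when every UTF-16 code unit is in the 7-bit ASCII range.
function isAsciiOnly(text) {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 0x7f) return false
  }
  return true
}
```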

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

gingerbeardman commented 5 years ago

Let's still do this.

gingerbeardman commented 4 years ago

I still run into this problem.

I think perhaps my issues would mostly be solved by switching off Polyglot for the contents of input/textarea elements?
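
One way this could look, as a sketch only. `isFormField` is a hypothetical name, and where Polyglot would call it (for example on the element containing the selection) is an assumption.

```javascript
// Return true when the element is an <input> or <textarea>.
function isFormField(el) {
  if (!el) return false
  // tagName is upper-case in HTML documents, so normalise before comparing.
  const tag = String(el.tagName || '').toLowerCase()
  return tag === 'input' || tag === 'textarea'
}
```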

uetchy commented 4 years ago

Putting an on/off toggle switch inside the textarea (as Grammarly does) is one of our options.

[Screenshot: 2020-03-18 at 19:49:38]

gingerbeardman commented 4 years ago

I will look into how that works.

AntonUspishnyi commented 4 years ago

Hi there. I have Safari in English, but I want to translate English to Russian. How can I do this?

gingerbeardman commented 4 years ago

@AntonUspehov did you set your preferences in the Polyglot app?

See readme > setup > step 3

gingerbeardman commented 4 years ago

This issue still annoys me every day.

How about disabling Polyglot for certain domains? For me that would be English Wikipedia for example.
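
A sketch of what such a check might look like. `isDisabledForHost` and the idea of a user-maintained list of excluded domains are hypothetical, not existing Polyglot settings.

```javascript
// Return true when hostname equals an excluded domain or is a subdomain
// of one. Matching is case-insensitive.
function isDisabledForHost(hostname, excludedDomains) {
  const host = String(hostname || '').toLowerCase()
  return excludedDomains.some((domain) => {
    const d = String(domain).toLowerCase()
    return host === d || host.endsWith('.' + d)
  })
}
```

For example, `isDisabledForHost('en.wikipedia.org', ['en.wikipedia.org'])` is true, while the same list leaves `ja.wikipedia.org` enabled.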

uetchy commented 3 years ago

Doing a quick experiment. It's mostly working fine, though.

[Screenshot: 2021-01-25 at 2:17:32 PM]

gingerbeardman commented 3 years ago

This does not work reliably for me.

E.g. in my eBay.co.uk messages, selecting part of a message still triggers Polyglot.