alpheios-project / alpheios-core

Alpheios Core Javascript Packages and Libraries
15 stars 2 forks source link

Double click acting funny in Latin Library #315

Closed monzug closed 4 years ago

monzug commented 4 years ago

FF and Chrome in Alpheios Reading Tools 3.1.0 build qa.20200507338 Alpheios Components 3.3.0-qa.20200507327

as soon as I installed the latest build, I noticed that double click was acting funny in Latin Library (not Loeb Classics). in FF, when I double click on a word it does actually select the previous or the following one (sometimes the word itself), depending on where the mouse is placed. see screenshot

FF-latin-library

in Chrome, it does generate the correct pop-up when double clicking on a word but the highlighted word is truncated. see screenshot

chrome-latin-library

Do you think it could be related to the mouse move accuracy?

balmas commented 4 years ago

@monzug can you double check if you enable and disable mousemove, does the problem go away? I was experiencing something similar and I realized that the problem was that mousemove AND double-click seemed to be enabled at the same time. I'm not sure how that scenario occurs (and if it still does, it's a bug) but it would help to know if that is what you are experiencing too. Thanks!

monzug commented 4 years ago

I haven't changed any default options with the latest build though.

On Tue, May 12, 2020 at 9:19 AM Bridget Almas notifications@github.com wrote:

@monzug https://github.com/monzug can you double check if you enable and disable mousemove, does the problem go away? I was experiencing something similar and I realized that the problem was that mousemove AND double-click seemed to be enabled at the same time. I'm not sure how that scenario occurs (and if it still does, it's a bug) but it would help to know if that is what you are experiencing too. Thanks!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/alpheios-project/alpheios-core/issues/315#issuecomment-627339503, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJ32UOMSPG3IKIZKQHWRWUDRRFEF3ANCNFSM4M62FFPA .

balmas commented 4 years ago

But it could be that the initialization of the setting isn't working quite right.

monzug commented 4 years ago

ok

On Tue, May 12, 2020 at 9:23 AM Bridget Almas notifications@github.com wrote:

But it could be that the initialization of the setting isn't working quite right.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/alpheios-project/alpheios-core/issues/315#issuecomment-627341920, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJ32UOO2FX5O2SJ4MD2RI6TRRFETXANCNFSM4M62FFPA .

monzug commented 4 years ago

nope, it hasn't changed the behavior.

balmas commented 4 years ago

ok, thanks for checking. Assigning to @kirlat to investigate

kirlat commented 4 years ago

I think this text with unusual formatting reveals a possible bug in our code. When we do a replacement of special characters in HTML during a selection, we sometimes change the length of a text. As a result, selection shifts.

@monzug, to verify that this is the case, can you please do the following:

  1. Open https://www.thelatinlibrary.com/gellius/gellius13.shtml in FF.
  2. Do a double click within paragraph that starts with V: "V. Item alio in loco idem significat". Selection should work correctly.
  3. Do a double click within some other long paragraph like the first one: "I. Inquisitio verborum istorum ...". In that case selection should shift. If your tests will have the same results as described above, this confirms the issue and I will look for a character, or group of characters, that are causing this. Thanks!
monzug commented 4 years ago

yes, I can confirm your findings. it works well in Item alio in loco idem significat but not in Inquisitio verborum istorum . Thanks

irina060981 commented 4 years ago

I think this text with unusual formatting reveals a possible bug in our code. When we do a replacement of special characters in HTML during a selection, we sometimes change the length of a text. As a result, selection shifts.

@kirlat , there was another bug - https://github.com/alpheios-project/alpheios-core/pull/303 and that's why I updated the code to replace a sequence of special characters with only one char

I believe that here we have some conflict in different formatting May be it is worth to create a more intellectual char replacement

kirlat commented 4 years ago

Thanks for pointing to the other issue related to this. I agree it would be good to rework a replacement to make it more reliable.

I narrowed the problem down to the replacement pattern. For Latin, which this issue is about, it is ".,;:!?'\"(){}\\[\\]<>/\\\u00A0\u2010\u2011\u2012\u2013\u2014\u2015\u2018\u2019\u201C\u201D\u0387\u00B7\n\r\u200C\u200D". This pattern is anchorText = anchorText.replace(new RegExp([${punctuation}]+, 'g'), ' '), and sometimes it replaces multiple characters with a single space. With experimental approach I figured that if we remove \" and \n from the pattern the string length remains the same, but it leaves double quotes in a text, of course. What looks strange to me is the + at the end of the character set. If I read it correctly, it means "replace one or more characters from a set with a single space" which does not seem right, so probably it should be removed.

I'm also not sure I fully understand the character set: why [ and ] are escaped with double backslashes (if this is an escaping)? Shouldn't it be a single one? @balmas, maybe you remember something about this pattern? I think changing this part can have some unexpected side effects that would be very hard to catch later and I want to make sure I'm not missing anything before doing a change.

balmas commented 4 years ago

I'm also not sure I fully understand the character set: why [ and ] are escaped with double backslashes (if this is an escaping)?

It's possible the double-back slashes before the brackets are an error, I think at one point they were necessary but perhaps not anymore.

balmas commented 4 years ago

see also this PR which is where we added 200C and 200D to the list of punctuation characters for Google Docs https://github.com/alpheios-project/data-models/pull/89

kirlat commented 4 years ago

I'm testing a selection process in Google Docs (because this issue is related to #288 and #303) and I cannot reproduce any cases with special characters where we need a + quantifier (see https://github.com/alpheios-project/alpheios-core/pull/303/files#r420631680): it seems to work fine without it. @irina060981, do you know of any texts on which this could be tested? If we could remove the + that would probably resolve most of our issues with incorrect text selection.

I do, however, see issues with selecting short words at the end of the line, as @monzug reported earlier.

Google Docs inserts an empty span at the end of each line and, because of that, the following clause is executed: https://github.com/alpheios-project/alpheios-core/blob/master/packages/components/src/lib/selection/media/html-selector.js#L243-L250. I think this clause is intended to fix selection issues with FF. In our case the text is selected correctly yet the clause is still executed because of an empty span at the end of the line.

As a result, we're losing a start of selection ro and are selecting a whole string from which we then select a first word.

I will experiment to see what we can do about it without changing the current logic too much.

kirlat commented 4 years ago

I think our current business logic assumes that selection starts and ends within the same node. That's why we think it's a mistake when text of the start node does not match the text of the end node.

A case with Google Docs and words at the end, where a word is followed by an empty span (a different node), it is a valid case when start and end nodes do not match. Because of mouse move accuracy (10 pixels by default), when pointer is at the center of a two-letter word, or at the end of a long word, selection spills into the empty span next to the word, and we have a situation when start node and end node do not match and which is NOT an error.

@balmas, there is a comment in the code indicating we need this to fix selection errors in FF. I'm wondering how old is it and if it is still needed. FF went through many changes recently, even though there are some occasional bugs here and there.

balmas commented 4 years ago

I'm a little worried about making major changes to our selection code at this stage in the release It would help to know whether any of these problems exist in the current release.

kirlat commented 4 years ago

We do not have mouse move in the current release, so we cannot compare it. An empty span in Google Docs has a goog-inline-block class, so maybe we can check for that and, if present, ignore it? It might change in the future, but we can fix it for now, and we'll be able to make amend in the future. It will also not affect an existing workflow because it will be triggered only within Google Docs, where goog-inline-block items are present.

So I think we need two changes to make it work in Google Docs AND fix the double click issue:

  1. Remove a + quantifier from a character set. This will return it to the state that will match production and should be harmless.
  2. Ignore elements with a goog-inline-block class. This will be triggered only in Google Docs environment will not affect the regular workflow. It seems that those changes might have a minimal impact, but I'm afraid I'm missing any test cases that are harder and trickier than the ones I was testing with. Do you know of any special cases that would be good to test it?
balmas commented 4 years ago

I think we should definitely do "1. Remove a + quantifier from a character set. This will return it to the state that will match production and should be harmless."

I'm guessing that the problem with google docs could in theory happen on other pages with mousemove. Need to think about this a little bit more.

kirlat commented 4 years ago

I'm guessing that the problem with google docs could in theory happen on other pages with mousemove.

I think that's true. But do we currently have mosuemove activated only on Google Docs pages?

I've also noticed another issue: if there is a Google document opened in a tab, and an Alpheios is activated in it, a mousemove mode becomes activated on all other tabs as well: if to hold mouse over a word, it starts a lexical query, no matter what tab it is. I'm wondering if anyone else can reproduce it or is it just me.

balmas commented 4 years ago

I think that's true. But do we currently have mosuemove activated only on Google Docs pages?

It is only automatically enabled for Google Docs. But it can be turned on for any page if the user prefers, and that is the recommended setting for Chinese.

balmas commented 4 years ago

I've also noticed another issue: if there is a Google document opened in a tab, and an Alpheios is activated in it, a mousemove mode becomes activated on all other tabs as well: if to hold mouse over a word, it starts a lexical query, no matter what tab it is. I'm wondering if anyone else can reproduce it or is it just me.

That sounds like a bug we need to fix.

balmas commented 4 years ago

@balmas, there is a comment in the code indicating we need this to fix selection errors in FF. I'm wondering how old is it and if it is still needed. FF went through many changes recently, even though there are some occasional bugs here and there.

@kirlat is this the comment you are referring to?

  // firefox's implementation of getSelection is buggy and can result
    // in incomplete data - sometimes the anchor text doesn't contain the focus data
    // and sometimes the focus data and anchor text is just whitespaces
    // in these cases we just use the target textContent

It would probably be worth testing to see if that is still an issue -- I think it was this one but unfortunately it doesn't contain a test case https://github.com/alpheios-project/webextension/issues/53

balmas commented 4 years ago

A case with Google Docs and words at the end, where a word is followed by an empty span (a different node), it is a valid case when start and end nodes do not match. Because of mouse move accuracy (10 pixels by default), when pointer is at the center of a two-letter word, or at the end of a long word, selection spills into the empty span next to the word, and we have a situation when start node and end node do not match and which is NOT an error.

@kirlat could we just test for this scenario - i.e. an empty span - without restricting it to the googledocs class? Would there be any reason not to do this?

irina060981 commented 4 years ago

@kirlat , here it is a test with Monica's sample text https://docs.google.com/document/d/13DgFWex6L2r9jCXba7CW104KlD3U9nHRkG3bu_srROg/edit

try to select short greek words (it was not working without plus in replace regex)

Ὅσοις τῶν ἀνθρώπων ἐκ τῆς αὑτῶν2 ὁρμωμένοις χώρας ὑπερόριοί τε ἀγῶνες καὶ κίνδυνοι συμβαίνουσιν, ἄν τι σφάλμα γένηται

kirlat commented 4 years ago

here it is a test with Monica's sample text https://docs.google.com/document/d/13DgFWex6L2r9jCXba7CW104KlD3U9nHRkG3bu_srROg/edit

Thanks for a test reference, will test it on this.

kirlat commented 4 years ago

@kirlat could we just test for this scenario - i.e. an empty span - without restricting it to the googledocs class? Would there be any reason not to do this?

I think it is a good approach. I will implement this solution and will check if any issues will arise during testing.

kirlat commented 4 years ago

The changes that would hopefully fix it are merged to master.

irina060981 commented 4 years ago

@kirlat , while working on the current changes for issue - I have merged your changes and now - all selections for me with double-click reselect the word without the last char on http://www.thelatinlibrary.com/

image

At the same time there is no problems on https://www.loebclassics.com/

image

and on github it is well working too

image

balmas commented 4 years ago

@kirlat , while working on the current changes for issue - I have merged your changes and now - all selections for me with double-click reselect the word without the last char on http://www.thelatinlibrary.com/

I was seeing the same, now reported in #362

kirlat commented 4 years ago

Thanks for the reports, I will investigate this.

kirlat commented 4 years ago

I can reproduce it, working on a fix

kirlat commented 4 years ago

That was easy, somehow a + quantifier sneaked backed in into a replacement character set. Removing it seems to be fixing the issue.

irina060981 commented 4 years ago

@kirlat , did you find the reason why it creates problems on thelatinlibrary only?

kirlat commented 4 years ago

@kirlat , did you find the reason why it creates problems on thelatinlibrary only?

Yes, it's because it might replace multiple characters with one space; as a result, text length is changing and selection is shifting. This is not specific to The Latin Library, I think; it is specific to paragraphs that has multiple replacement characters grouped together that are replaced with a single space. I think there might be paragraphs without punctuation in The Latin Library where it will work and there might be paragraphs with punctuation on other pages where it won't. It all depends on a punctuation in a specific paragraph, I think.

balmas commented 4 years ago

Alpheios Components 3.3.0-qa.20200522457

monzug commented 4 years ago

it's working as expected. Thanks tested in FF and Chrome (both greek and latin words) and google docs