apache / incubator-annotator

Apache Annotator provides annotation enabling code for browsers, servers, and humans.
https://annotator.apache.org/
Apache License 2.0

For some webpages, when highlighting a text, the text is split #129

Open raphael10-collab opened 2 years ago

raphael10-collab commented 2 years ago

For some webpages I'm getting this strange behavior:

image

For example (above), when trying to highlight the word "negligible", it gets split into two parts: negl + igible.

This is the code I'm using:

// https://annotator.apache.org/docs/getting-started/

import {
  createTextQuoteSelectorMatcher,
  describeTextQuote,
  highlightText,
} from '@apache-annotator/dom';

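async function describeCurrentSelection() {
  const selection = window.getSelection();

  // getRangeAt(0) throws when the selection holds no range, so check first.
  if (!selection || selection.rangeCount === 0) return;
  const userSelection = selection.getRangeAt(0);

  console.log("view-preload.ts-userSelection: ", userSelection)

  // A Range exposes `collapsed`; `isCollapsed` exists only on Selection.
  if (userSelection.collapsed) return;

  // describeTextQuote returns a promise, so await it before logging the result.
  const selector = await describeTextQuote(userSelection);
  console.log("view-preloads-describeTextQuote(userSelection): ", selector)

  return selector;
}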

async function highlightSelectorTarget(textQuoteSelector) {
  const matches = createTextQuoteSelectorMatcher(textQuoteSelector)(document.body);

  console.log("view-preload.ts-highlightSelectorTarget-matches: ", matches)

  // Modifying the DOM while searching can mess up; see issue #112.
  // Therefore, we first collect all matches before highlighting them.
  const matchList = [];
  for await (const match of matches) matchList.push(match);

  for (const match of matchList) highlightText(match);
}

document.addEventListener('mouseup', async () => {
  const selector = await describeCurrentSelection();

  // A plain click yields no selector; do not store or highlight anything then.
  if (!selector) return;

  const existingSelectors = JSON.parse(localStorage[document.URL] || '[]');
  localStorage[document.URL] = JSON.stringify([...existingSelectors, selector]);
  await highlightSelectorTarget(selector);
})

// Highlight all selections that were stored for this page, if any.
async function highlightStoredSelectors() {
  if (localStorage[document.URL]) {
    const selectors = JSON.parse(localStorage[document.URL]);
    for (const selector of selectors) {
      console.log("view-preload.ts-highlightStoredSelectors-selector: ", selector)
      await highlightSelectorTarget(selector);
    }
  }
}
highlightStoredSelectors()

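// Mouse buttons 3 and 4 are the back/forward buttons. goBack() and goForward()
// are not defined in this snippet and are assumed to exist elsewhere.
window.addEventListener('mouseup', (e) => {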
  if (e.button === 3) {
    e.preventDefault();
    goBack();
  } else if (e.button === 4) {
    e.preventDefault();
    goForward();
  }
});

Why does this happen?

reckart commented 2 years ago

Try looking at the HTML code of the page at that location after you have selected the word - what does it look like? Does it make sense?
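One way to do that inspection from the devtools console is sketched below. It assumes the highlights are rendered as mark elements; collectHighlightMarkup is a hypothetical helper name, not part of @apache-annotator/dom.

```javascript
// Collect the markup around every highlight so it can be read in the console.
// `root` is any element (or the document); 'mark' is assumed to be the
// tag used for highlights.
function collectHighlightMarkup(root) {
  const parents = new Set();
  for (const mark of root.querySelectorAll('mark')) {
    if (mark.parentElement) parents.add(mark.parentElement);
  }
  // One entry per distinct parent, even if it contains several highlights.
  return Array.from(parents, (parent) => parent.outerHTML);
}

// Only runs in a browser; does nothing elsewhere.
if (typeof document !== 'undefined') {
  for (const html of collectHighlightMarkup(document.body)) console.log(html);
}
```

If the word was split, the logged markup should make it visible whether it ended up in one element or in several.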

Treora commented 2 years ago

A quick look at that website reveals that it has a CSS rule for the <mark> element:

image

Bootstrap styles it with padding. Hence, if you have two mark elements right beside each other (e.g. because you first highlighted one part, then another), the padding between the elements produces the result you saw.

We should fix most occurrences of this issue by normalising consecutive text nodes, to avoid multiple mark elements when we only need one. (I forget what the state of this is; did we perhaps fix it already without releasing it yet? See #80.)
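Independently of any library-side fix, an application could merge neighbouring highlights itself. The following is a minimal sketch, not the library's approach. It assumes highlights are plain mark elements and that folding two of them into one is acceptable for the tool. mergeAdjacentMarks is a hypothetical helper name.

```javascript
// Merge every <mark> into the <mark> directly before it, so a run of adjacent
// highlights becomes one element. `marks` must be in document order, e.g. the
// result of document.querySelectorAll('mark'). Returns the number of merges.
function mergeAdjacentMarks(marks) {
  let merged = 0;
  for (const mark of Array.from(marks)) {
    const previous = mark.previousSibling;
    const isMark =
      previous && previous.nodeType === 1 && previous.localName === 'mark';
    if (!isMark) continue;

    // appendChild moves a node, so this loop empties `mark` into `previous`.
    while (mark.firstChild) previous.appendChild(mark.firstChild);
    mark.remove();

    // Join the text nodes that now sit side by side inside `previous`.
    previous.normalize();
    merged++;
  }
  return merged;
}

// Only runs in a browser; does nothing elsewhere.
if (typeof document !== 'undefined') {
  mergeAdjacentMarks(document.querySelectorAll('mark'));
}
```

Marks separated by anything, including whitespace, are left alone. Merging also discards the boundary between the original highlights, so any per-highlight bookkeeping (such as a stored removal callback) may no longer behave as expected.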

Also, it would be nice if the webpage’s CSS did not apply to your annotation tool’s highlights. Perhaps the mark elements can be given a rule all: revert !important;?
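A minimal sketch of that idea follows, assuming the tool's highlights carry a dedicated class so the rule does not touch the page's own mark elements. The class name and the helper buildHighlightResetRule are hypothetical.

```javascript
// Hypothetical class name that scopes the reset to the tool's own highlights.
const HIGHLIGHT_CLASS = 'my-annotation-highlight';

// Build a CSS rule that rolls the page's author styles (such as Bootstrap's
// padding on <mark>) back to the browser defaults for those highlights.
function buildHighlightResetRule(className) {
  return `mark.${className} { all: revert !important; }`;
}

// Inject the rule once. Only runs in a browser; does nothing elsewhere.
if (typeof document !== 'undefined') {
  const style = document.createElement('style');
  style.textContent = buildHighlightResetRule(HIGHLIGHT_CLASS);
  document.head.appendChild(style);
}
```

The class still has to end up on the highlight elements. highlightText appears to accept a tag name and an attributes object after the target, as in highlightText(match, 'mark', { class: HIGHLIGHT_CLASS }), but treat that signature as an assumption and check it against the installed version. Since revert falls back to the browser's default stylesheet, the highlight should keep the default mark appearance. A page rule that is itself !important and more specific could still win.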

raphael10-collab commented 2 years ago

@Treora Interesting. Thank you very much. I do hope it will be fixed soon.