recogito / recogito-js

A JavaScript library for text annotation
BSD 3-Clause "New" or "Revised" License
349 stars 38 forks source link

Prevent creation of zero-width SPANs #83

Open rsimon opened 1 year ago

rsimon commented 1 year ago

RecogitoJS produces zero-width annotation spans in some cases. Don't do this.

Also see this related issue in the connections plugin: https://github.com/recogito/recogito-connections/issues/18

kdaniel21 commented 1 year ago

Hey @rsimon, we've also encountered the same issue. In our case the biggest problem was that when we opened annotations using selectAnnotation, the arrow pointed at the first (empty) span, which could be in a different row.

I've also made a quick reproduction: StackBlitz

Screenshot_20230324_214241

deaddecoy commented 11 months ago

I think I found a solution to this issue. Highlighter tries to match textNodes to start, end positions in the annotation. The original code did not appear to handle boundary conditions and would include preceding sections if the end touched the start of the annotation.

I edited Highlighter.js > calculateDomPositionWithin from ` calculateDomPositionWithin = (textNodeProperties, charOffsets) => { var positions = [];

textNodeProperties.forEach(function(props, i) {
  charOffsets.forEach(function(charOffset, j)  {
    if (charOffset >= props.start && charOffset <= props.end) {
      // Don't attach nodes for the same charOffset twice
      var previousOffset = (positions.length > 0) ?
            positions[positions.length - 1].charOffset : false;

      if (previousOffset !== charOffset)
        positions.push({
          charOffset: charOffset,
          node: props.node,
          offset: charOffset - props.start
        });
    }
  });

  // Break (i.e. return false) if all positions are computed
  return positions.length < charOffsets.length;
});

return positions;

}`

to

` calculateDomPositionWithin = (textNodeProperties, charOffsets) => { var startPosition = false; var endPosition = false;

textNodeProperties.forEach(function(props, i) {
  var charoffset = false;
  var offset = false;
  if (charOffsets[0] >= props.start && charOffsets[0] <  props.end) {
      startPosition = {
          charOffset: charOffsets[0],
          node: props.node,  
          offset: charOffsets[0] - props.start
      }
  } 
  if (charOffsets[1] > props.start && charOffsets[1] <= props.end) {
      endPosition = {
          charOffset: charOffsets[1],
          node: props.node,  
          offset: charOffsets[1] - props.start
      }
  } 
  // Break (i.e. return false) if start and end positions are not computed
  return ((startPosition != false) & (endPosition != false));
});

return [startPosition, endPosition];

}`

This has the added benefit of having relations snap to the correct nodes.