pelagios / recogito2

Semantic Annotation Without the Pointy Brackets
Apache License 2.0

Update and streamline XPath generation #644

Status: Closed (hcayless closed 5 years ago)

hcayless commented 5 years ago

This does a couple of things: it updates CETEIcean to the latest release, and deals with some of the fallout from that (default behaviors that insert text nodes into the DOM and can therefore interfere with annotation resolution). It also changes the way XPaths for annotations in the browser are generated, hopefully making them more concise and correct.

rsimon commented 5 years ago

Hi @hcayless,

many thanks. I tested locally and everything looks like it's working just fine. Can I ask for a few additional explanations/confirmations, so that I can (reasonably well ;-) understand what's going on?

hcayless commented 5 years ago

PathUtils.js is as you say. The biggest difference is that if it finds an @xml:id it doesn’t look any further up the tree, because that’s a unique ID. So it’ll generate a path starting with // + the element with the ID.
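To illustrate the idea (a minimal sketch, not the actual PathUtils.js code): the function below walks up from a node and stops as soon as it meets an element with an ID, producing a path that starts with `//`. The names `toXPath` and `el` are hypothetical, the attribute name `xml:id` is assumed to be readable via `getAttribute`, and plain objects stand in for DOM elements so the snippet runs outside a browser.

```javascript
// Plain-object stand-in for a DOM element (hypothetical helper;
// real code would receive DOM Elements instead).
function el(nodeName, attrs = {}, children = []) {
  const node = {
    nodeName,
    parentNode: null,
    children,
    getAttribute: (name) => (name in attrs ? attrs[name] : null),
  };
  children.forEach((child) => { child.parentNode = node; });
  return node;
}

// Build a path for `target`, climbing until `root` or an element with an xml:id.
function toXPath(target, root) {
  const steps = [];
  let node = target;
  while (node && node !== root) {
    const id = node.getAttribute('xml:id');
    if (id) {
      // IDs are unique, so anchor the path here and stop climbing.
      steps.unshift(`${node.nodeName}[@xml:id='${id}']`);
      return '//' + steps.join('/');
    }
    // Otherwise record the 1-based position among same-named siblings.
    const siblings = node.parentNode
      ? Array.from(node.parentNode.children).filter((s) => s.nodeName === node.nodeName)
      : [node];
    steps.unshift(`${node.nodeName}[${siblings.indexOf(node) + 1}]`);
    node = node.parentNode;
  }
  // No ID found on the way up: the path lists every step below `root`.
  return '/' + steps.join('/');
}

const p2 = el('p');
const div = el('div', { 'xml:id': 'ch1' }, [el('p'), p2]);
const body = el('body', {}, [div]);
const root = el('TEI', {}, [body]);

console.log(toXPath(p2, root));   // //div[@xml:id='ch1']/p[2]
console.log(toXPath(body, root)); // /body[1]
```

The second call shows the fallback when no ancestor carries an ID: every step up to the root is listed, which is the longer form the ID shortcut avoids.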

Highlighter.js just changes the XPath a bit. Instead of stripping off the leading character, which may no longer be enough to make it a relative path, it prefixes it with a ‘.’, making it a relative path starting with the context node.
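A small sketch of why the prefix matters (the helper name `toRelativePath` and the sample path are hypothetical, not taken from Highlighter.js): once a path can start with `//`, dropping one character still leaves a leading `/`, so the path stays absolute, whereas a leading `.` ties it to the context node.

```javascript
// Hypothetical helper: make a stored path relative to the context node.
function toRelativePath(xpath) {
  return xpath.startsWith('.') ? xpath : '.' + xpath;
}

const stored = "//div[@xml:id='ch1']/p[2]";

// Old approach: strip the first character. The result still begins with "/",
// so it is still evaluated from the document root.
const stripped = stored.substring(1);

// New approach: prefix with "." so evaluation starts at the context node.
const relative = toRelativePath(stored);

console.log(stripped); // /div[@xml:id='ch1']/p[2]
console.log(relative); // .//div[@xml:id='ch1']/p[2]
```

In a browser, the relative form would then be evaluated with the annotated content's container as the context node, for example through `document.evaluate`.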

As for the CETEIcean setup, that’s exactly right. Newer versions by default do some insertions to produce the browser-viewable result, and that can mess up the character offsets. For something like Recogito, we really want a DOM that’s (at least as far as the text nodes go) isomorphic to the source. There’s maybe a bit more work to be done there to make sure everything’s dealt with correctly, but for now I’m just removing most of the defaults.
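To make the offset problem concrete, here is a toy sketch. It does not use the CETEIcean API; the trees are plain objects, and the inserted brackets are an invented example of generated text. An annotation stored as character offsets against the source text selects the wrong characters once the rendered tree contains extra text nodes.

```javascript
// Concatenate the text of a toy node tree in document order.
// Strings play the role of text nodes; objects play the role of elements.
function textOf(node) {
  if (typeof node === 'string') return node;
  return node.children.map(textOf).join('');
}

// Tree that mirrors the source document.
const source = {
  name: 'p',
  children: ['See ', { name: 'ref', children: ['note 1'] }, '.'],
};

// Same tree after a hypothetical rendering step wrapped the ref content in brackets.
const rendered = {
  name: 'p',
  children: ['See ', { name: 'ref', children: ['[', 'note 1', ']'] }, '.'],
};

// Annotation recorded against the source text: characters 4 to 10 are "note 1".
const annotation = { start: 4, end: 10 };

const fromSource = textOf(source).slice(annotation.start, annotation.end);
const fromRendered = textOf(rendered).slice(annotation.start, annotation.end);

console.log(JSON.stringify(fromSource));   // "note 1"
console.log(JSON.stringify(fromRendered)); // "[note "
```

Keeping the rendered text nodes identical to the source avoids this drift, which is the reason for switching off behaviors that insert text.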

rsimon commented 5 years ago

Got it, many thanks for the explanations. I'm merging this now & will push to production on Monday. (Because here at the Pelagios HQ, we never deploy to production on a Friday ;-)