hcayless closed this issue 4 years ago
Hi @hcayless,
ah great, thanks. The code is here:
However: could this cause issues with the production instance? Unless there's a solution that can read & render both the old and the new XPaths, would I need to migrate the existing annotations that are already in our index?
Cheers, Rainer
And that's the part where the DOM location is restored by parsing the XPath:
If I'm reading the code right, you're translating the path here: https://github.com/pelagios/recogito2/blob/b91d8c0cd29734d360dfdc6bd8a04cafdd60d82a/app/controllers/HasTEISnippets.scala#L19-L29. I guess it would be up to you whether to update the index or just keep doing the replacements. They'd be no-ops for correct paths, so it would do no harm apart from costing you a few milliseconds.
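To make the "no-op for correct paths" point concrete, here is a minimal Python sketch of a case normalization that is safe to run on both old and new anchors. It is not the Scala code linked above. The name `normalize_path` and the small `CANONICAL` table are assumptions for illustration, and the table lists only a few real TEI element names, not the full set.

```python
import re

# Illustrative subset of TEI element names whose canonical spelling
# is not all-lowercase (assumption: a real table would be longer).
CANONICAL = {
    "tei": "TEI",
    "teiheader": "teiHeader",
    "persname": "persName",
    "placename": "placeName",
}

# Element name at the start of each step, i.e. the name right after a "/".
# Assumes predicate values such as @id do not themselves contain "/".
_STEP_NAME = re.compile(r"(?<=/)[A-Za-z][\w.-]*")


def normalize_path(anchor: str) -> str:
    """Rewrite step names to canonical TEI case; other text is left untouched."""
    def fix(match: re.Match) -> str:
        name = match.group(0)
        return CANONICAL.get(name.lower(), name)

    return _STEP_NAME.sub(fix, anchor)


legacy = "/tei/text/body/div[@id='edition-text']/p[@id='p1']::146"
fixed = normalize_path(legacy)
# Running the function again on an already-correct path changes nothing.
assert normalize_path(fixed) == fixed
```

Because the lookup is keyed on the lowercased name, correctly cased paths map to themselves, which is what allows old and new anchors to coexist during a transition period.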
Just to confirm: we're talking exclusively about the proper TEI capitalization here, right?
I'm inclined to leave the replaceAll statements in, for now. As you say, it would only add milliseconds for correct paths.
In the long run, yes, I'm all for updating the stored XPaths in the production instance, too. However, I'd need some spare time to write the script that scrolls through ElasticSearch, rewrites the paths, and updates the records (and the courage to hit "run" on that script in production ;-) I.e. if the solution allows for a transition period, that would be better ;-)
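For what it's worth, the scary part of such a script can be separated from the rewriting logic. The Python sketch below only computes which records would change (a dry run) and leaves the actual scrolling and updating to whatever client is used. The record shape (`id`, `anchor`) and the function name `plan_updates` are hypothetical, not taken from the Recogito index.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def plan_updates(
    records: Iterable[Dict[str, str]],
    normalize: Callable[[str], str],
    field: str = "anchor",
) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, new_anchor) only for records whose anchor would change."""
    for record in records:
        old = record[field]
        new = normalize(old)
        if new != old:
            yield record["id"], new


# Stand-in data; a real run would feed in documents read from the index.
sample = [
    {"id": "a1", "anchor": "/tei/text/body/p[@id='p1']::3"},
    {"id": "a2", "anchor": "/TEI/text/body/p[@id='p2']::7"},
]

# Simplest possible normalizer for the demo: fix the root step only.
for record_id, new_anchor in plan_updates(
    sample, lambda a: a.replace("/tei/", "/TEI/", 1)
):
    print(record_id, new_anchor)
```

Reviewing the dry-run output before writing anything back makes the production run less of a leap, and records that are already correct are never touched.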
Yes, that's right. And the transition period can be as long as you want if you leave the replacement code in. From my perspective, I just want to make it easier for external software to deal with exported annotations—they don't currently get rewritten during export.
Ah, yes - good point! Let's do this then :-)
Hi @hcayless, just pinging about this - I think you already added this in one of your last pull requests, right? If so, we can close the issue.
Cheers, R
Given a from anchor like

/tei/text/body/div[@id='edition-text']/div[@id='part1']/p[@id='p1']/seg[@id='seg-1.1']::146

it would be worth getting the XPath "TEI-correct", meaning getting the steps to follow TEI naming conventions. So it would be

/TEI/text/body/div[@id='edition-text']/div[@id='part1']/p[@id='p1']/seg[@id='seg-1.1']::146

CETEIcean will store the original, properly-cased, element name in a @data-teiname attribute on each element. If you can point me at the spot where you're generating the anchors, I could probably patch it to use that instead. And that would mean you won't have to paper over the case inconsistencies.
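As a rough illustration of the idea (not the actual anchor-generation code, which is client-side and not shown in this thread), the Python sketch below builds an anchor from a chain of elements and prefers the data-teiname value over the lowercased DOM name. The `Element` class, `step`, and `build_anchor` are hypothetical stand-ins for the real DOM traversal.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    """Hypothetical stand-in for a rendered DOM element."""
    name: str  # name as seen in the DOM, typically lowercased
    attrs: Dict[str, str] = field(default_factory=dict)


def step(el: Element) -> str:
    """One XPath step: prefer the stored TEI name, add an @id predicate if present."""
    tei_name = el.attrs.get("data-teiname", el.name)
    el_id = el.attrs.get("id")
    # Assumes ids contain no single quotes.
    return f"{tei_name}[@id='{el_id}']" if el_id else tei_name


def build_anchor(chain: List[Element], offset: int) -> str:
    """Join the steps from root to target and append the character offset."""
    return "/" + "/".join(step(el) for el in chain) + f"::{offset}"


chain = [
    Element("tei", {"data-teiname": "TEI"}),
    Element("text"),
    Element("body"),
    Element("p", {"id": "p1"}),
]
print(build_anchor(chain, 146))
```

Elements without the attribute fall back to the DOM name, so the output is no worse than today for content where the attribute is missing.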