Renumber lines in TEI

nichgray commented 5 years ago

Right now there is a discrepancy between the line numbers as they appear in the interface and as they appear in the @xml:id in the TEI. Consider updating these in TEI in some automated fashion.

(Remember that manuscripts and relations document link to existing IDs; these would need to be updated as well if this change is made.)

nichgray commented 4 years ago

Okay, here I think is what needs to be done:

[x] ppp.01880.xml (in source/tei): every @xml:id whose value begins with "l" needs to be renumbered one number higher, so for instance <l xml:id="l325"> would become <l xml:id="l326">, and <l xml:id="l0"> would become <l xml:id="l1">
[x] anc.02134.xml (in source/authority): every string in @target that begins with "ppp.01880.xml#l" should have the number after the "l" renumbered one higher, so for instance <link cert="low" target="ppp.01880.xml#l47 #ab01"/> would become <link cert="low" target="ppp.01880.xml#l48 #ab01"/>
[ ] In all files in /var/local/www/cocoon/whitmanarchive/manuscripts/tei/ and /var/local/www/cocoon/whitmanarchive/manuscripts/notebooks/tei/, any element whose @corresp points to an id in anc.02134.xml (or ppp.01880.xml) should have that @corresp attribute removed. For example, <l xml:id="l02" corresp="#l928"> would become <l xml:id="l02"> and <seg xml:id="tw02" corresp="#l507"> would become <seg xml:id="tw02">
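
For reference, the first checklist item could be sketched roughly as below. This is not the script that was actually run; `bump_line_ids` is a hypothetical helper name. It assumes the ids are written as `xml:id="l<digits>"` with double quotes and no zero padding (as in the examples above), and it works on the raw text so nothing else in the file is reserialized.

```ruby
# Hypothetical helper (not from the repo): bumps every xml:id of the form
# "l<digits>" by one, in a single pass over the raw file text.
# Assumes double-quoted attributes and no zero-padded numbers.
def bump_line_ids(xml_text)
  xml_text.gsub(/(xml:id=")l(\d+)(")/) do
    # $1 is the attribute prefix, $2 the old number, $3 the closing quote
    "#{$1}l#{$2.to_i + 1}#{$3}"
  end
end

puts bump_line_ids('<l xml:id="l325">')
puts bump_line_ids('<l xml:id="l0">')
```

Because `gsub` makes one pass over the original string, an id that has already been bumped is not matched a second time. Ids such as "ab01" or "tw02" are left alone.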

nichgray commented 4 years ago

I should clarify, per the second checklist item above, that ONLY the part between the "l" and the second "#" is affected (so for instance nothing after "#ab" should change in the above example)
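
A rough sketch of that rule, for illustration only: `bump_targets` is a hypothetical name, and the pattern assumes "ppp.01880.xml#l" followed by digits only ever appears inside @target values in anc.02134.xml. Only the digits right after the "l" are captured, so anything from the following space onward (such as "#ab01") is never touched.

```ruby
# Hypothetical helper (not from the repo): increments only the digits that
# directly follow "ppp.01880.xml#l"; the rest of the target is left as-is.
def bump_targets(xml_text)
  xml_text.gsub(/(ppp\.01880\.xml#l)(\d+)/) do
    "#{$1}#{$2.to_i + 1}"
  end
end

puts bump_targets('<link cert="low" target="ppp.01880.xml#l47 #ab01"/>')
```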

jduss4 commented 4 years ago

Third item -- match the corresp value, not the element id

jduss4 commented 4 years ago

I have completed tasks 1 and 2. I wrote a script that CAN remove all those corresps, but it was deemed too destructive: Nokogiri also reformats things wwa relies on, such as the return characters between <add> and <del> type elements, which need to appear as spaces. @nichgray will be working on an XSLT script to scrub the corresps instead.
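
As a point of comparison only (this is neither the Nokogiri script linked below nor the planned XSLT), a plain text substitution would avoid reserializing the file at all. The sketch below makes several assumptions: `strip_corresps` and `known_ids` are hypothetical names, attributes are double-quoted, and a @corresp is dropped whenever any of its pointers targets a known id.

```ruby
require "set"

# Hypothetical helper (not from the repo): removes corresp attributes that
# point at one of the known ids, editing the raw text so whitespace between
# elements is left exactly as it was.
def strip_corresps(xml_text, known_ids)
  xml_text.gsub(/\s+corresp="([^"]*)"/) do |attr|
    pointers = Regexp.last_match(1).split
    stale = pointers.any? { |p| known_ids.include?(p.delete_prefix("#")) }
    stale ? "" : attr # keep the attribute untouched when nothing matches
  end
end

ids = Set.new(%w[l928 l507]) # illustrative ids taken from the examples in this issue
puts strip_corresps('<l xml:id="l02" corresp="#l928">', ids)
```

Per the third-item note, the match is on the corresp value rather than the element's own id. A regex like this is blind to comments and CDATA, so it would need checking against the real files before being trusted.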

https://github.com/whitmanarchive/whitman-LG_1855_variorum/blob/master/scripts/one-offs/line_changer/remover.rb

whitmanarchive / whitman-LG_1855_variorum

Renumber lines in TEI #74