DigitalMitford / DM_processing

a repo for working on processing for the Digital Mitford project, including schemas, XSLT, XQuery, and other production and analysis efforts
http://digitalmitford.org
GNU Affero General Public License v3.0

SI Add files: check for existing SI entries #22

Closed ebeshero closed 7 years ago

ebeshero commented 7 years ago

In an uncertain universe, a Mitford editor may not have a) noticed or b) labelled an SI-add entry as either new or an update of an existing entry. We need to check as vigilantly as we can whether entries already exist in either case.

To do that, let's try hunting for strings of text inside entries, as well as inside @xml:id's. Here is a handy formulation using the contains() function which will help us to find internal strings.

//*[contains(@xml:id, "rusoe")]

This will locate any element whose xml:id contains "Crusoe" or "crusoe" anywhere inside it (it doesn't matter whether the string falls at the start, middle, or end).

Here's another potentially useful XPath formulation:

//person[contains(., " North ")]

This should look into the internal text contents of each person element for the string " North " (with a white space on either side).
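For anyone who would rather run these two checks from a script than from an XPath window, here is a minimal Python sketch of the same idea, using only the standard library. The sample XML, its xml:id values, and the function names `ids_containing` and `persons_containing` are made up for illustration and are not taken from the actual si files. Python's ElementTree does not support contains() in its path syntax, so the substring tests are done in plain Python instead.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the fixed XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical stand-in for a site index file; not real project data.
SAMPLE = """<listPerson>
  <person xml:id="Crusoe_Rob"><persName>Robinson Crusoe</persName></person>
  <person xml:id="North_Chris"><persName>Christopher North</persName>
    <note>Wrote as Christopher North for Blackwood's Magazine.</note></person>
  <person xml:id="Wordsworth_Dor"><persName>Dorothy Wordsworth</persName></person>
</listPerson>"""


def local_name(tag):
    """Strip any '{namespace}' prefix so namespaced files match too."""
    return tag.rsplit("}", 1)[-1]


def ids_containing(root, fragment):
    """Rough equivalent of //*[contains(@xml:id, fragment)]; returns the ids."""
    return [el.get(XML_ID) for el in root.iter()
            if fragment in (el.get(XML_ID) or "")]


def persons_containing(root, text):
    """Rough equivalent of //person[contains(., text)]; returns the ids."""
    hits = []
    for el in root.iter():
        # itertext() joins all descendant text, like the XPath string value.
        if local_name(el.tag) == "person" and text in "".join(el.itertext()):
            hits.append(el.get(XML_ID))
    return hits


root = ET.fromstring(SAMPLE)
print(ids_containing(root, "rusoe"))
print(persons_containing(root, " North "))
```

Like contains(), the `in` test is case-sensitive, which is why searching for "rusoe" rather than "Crusoe" catches both capitalizations. Note also that " North " with spaces on both sides will miss a name that sits at the very end of an element's text.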

@jonhoranic @brookestewart

jonhoranic commented 7 years ago

@ebeshero With my initial pass of the si-ADD-MRMCorrespondents file, I located this new entry xml:id="Wordsworth_Dora" that conflicts with the site index's existing xml:id="Wordsworth_Dor".

I have commented out the entry (located at the bottom) and saved a local copy of the si-ADD (which has my initials in the filename).

What is the best remedy for this issue? I will push the file in its current state so that you can view it.

EDIT: File pushed. The content is significantly more detailed than the existing entry.
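A near-miss like Wordsworth_Dora versus Wordsworth_Dor is the kind of conflict an exact-match check will not catch. As a rough sketch (not something the project currently does, as far as this thread shows), Python's standard difflib module can flag new ids that closely resemble existing ones. The function name `find_id_conflicts`, the 0.85 cutoff, and every id below other than the two Wordsworth ids from this thread are assumptions chosen for illustration.

```python
import difflib


def find_id_conflicts(new_ids, existing_ids, cutoff=0.85):
    """Map each new xml:id to existing ids it matches exactly or nearly.

    cutoff is a similarity ratio between 0 and 1; 0.85 is an arbitrary
    starting point and would need tuning against the real site index.
    """
    existing = sorted(set(existing_ids))
    conflicts = {}
    for new_id in new_ids:
        # get_close_matches also returns exact matches (ratio 1.0).
        close = difflib.get_close_matches(new_id, existing, n=3, cutoff=cutoff)
        if close:
            conflicts[new_id] = close
    return conflicts


# Hypothetical lists; in practice these would be pulled from the
# si-ADD file and from the existing site index.
new_ids = ["Wordsworth_Dora", "Guiccioli_Teresa"]
existing_ids = ["Wordsworth_Dor", "Wordsworth_Wm", "Crusoe_Rob"]

for new_id, matches in find_id_conflicts(new_ids, existing_ids).items():
    print(new_id, "resembles", matches)
```

A flagged pair still needs a human to decide whether it is a typo, a duplicate, or two genuinely different people with similar ids.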

ebeshero commented 7 years ago

@jonhoranic Jon--It seems likely the @xml:id was mistyped in the si-ADD-MRMCorrespondents file. Does the new entry add any information that isn't present in the existing si file at http://digitalmitford.org/si.xml?

ebeshero commented 7 years ago

@jonhoranic If it does add significant new info, we should fold that new info into the current entry (and that may take some splicing and editing, something like what we worked out together on Thurs. for the Teresa Guiccioli entry). What I do is try to rewrite the suggested new entry in a way that folds the two entries together. Sometimes (like with the TG entry we worked on), I rewrite them both and take the opportunity to add something new that is likely relevant to Mary Russell Mitford's world.

If it really doesn't add anything, we can simply delete it. In that case, it was probably an oversight of the existing entry on my colleague's part.