FreeUKGen / FreeCENMigration

Issue tracking for project migrating FreeCEN to FreeCEN2 genealogy record database and search engine architecture. Code developed here is based on that developed in MyopicVicar
https://www.freecen.org.uk
Apache License 2.0

Update New parms for 1851 #902

Closed Captainkirkdawson closed 4 years ago

Captainkirkdawson commented 4 years ago

@richardofsussex This is the processing log file for the 1851 data (1851-tidied-2.xml): https://app.zenhub.com/files/28748917/e776697d-5363-4a34-93b1-2343e7ee5b37/download

richardofsussex commented 4 years ago

Did you manage to get over the prenote issue? (It isn't that new - I introduced it a few iterations back because you're not allowed to repeat attribute names within an element.) If it's still a problem let me know, and I'll merge prenote and note into a single note attribute. I assume it's not critical info for our purposes anyway.
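
The constraint mentioned here (no repeated attribute names within an element) is a well-formedness rule, so any conforming parser will reject the file outright rather than pick one value. A minimal sketch in Python showing this; the element and attribute values are made up for illustration and are not taken from the 1851 file:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments: the first repeats the "note" attribute, the second
# splits the two values across "prenote" and "note" as described above.
repeated = '<parish name="Example" note="(1)" note="(2)"/>'
split = '<parish name="Example" prenote="(1)" note="(2)"/>'

def is_well_formed(fragment):
    """Return True if the fragment parses, False if the parser rejects it."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(repeated))  # False
print(is_well_formed(split))     # True
```

Merging both values into a single note attribute, as offered above, would also parse.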

Captainkirkdawson commented 4 years ago

Thanks @richardofsussex, as you say it's not a critical field. I only came to notice it while processing 1851 into the database and I had a barf. Have now accommodated it in the parish model, so it is no longer an issue. WRT the processing log, some of the issues may be mine, some may not.

richardofsussex commented 4 years ago

@Captainkirkdawson I've been through your log, and it tallies about 95% with what I picked up in my own tests (which is encouraging). I missed C3762952, which is Beaminster, and the Scilly Isles subdistrict really does have no parishes. There are three 'damaged' entries which ended up without names. The rest I have already got an update file for, and will meld in the updated entries in due course.

Captainkirkdawson commented 4 years ago

@richardofsussex I have reprocessed 1851-tidied-2.xml to correctly process the Islands in the British seas. I also updated the ingest software to handle a couple of other issues. No new issues were located.

Captainkirkdawson commented 4 years ago

@richardofsussex Have now started to look at the data online and found a few issues. There is a data extraction issue with a few districts, e.g. <district tnaid="C3762924" and C3762909. Several of the parish names contain a ward element, e.g. Ward: Great Crosby. That element should be extracted into a child. There are also many districts for Liverpool, starting at C3762912, where there are no subdistricts or parishes.
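
To make the request concrete, here is a rough sketch in Python of what extracting the ward into a child could look like. It assumes the ward sits in the parish name after a "Ward:" marker with the parish name in front of it; the real source data may be laid out differently, and the function name and sample values are illustrative only:

```python
import re
import xml.etree.ElementTree as ET

# Assumed input shape: "<parish name> Ward: <ward name>". This is a guess at
# the layout, not a description of the actual 1851 file.
WARD_MARKER = re.compile(r"\s*Ward:\s*")

def extract_wards(root):
    """Move a 'Ward: X' suffix of each parish name into a <ward> child."""
    # list() so that adding children does not disturb the traversal
    for parish in list(root.iter("parish")):
        parts = WARD_MARKER.split(parish.get("name", ""), maxsplit=1)
        if len(parts) == 2 and parts[1]:
            parish.set("name", parts[0])
            ET.SubElement(parish, "ward", {"name": parts[1]})
    return root

# Hypothetical sample, not real data
sample = ET.fromstring(
    '<subdistrict code="1" name="Example">'
    '<parish name="Example Parish Ward: Great Crosby"/>'
    '</subdistrict>'
)
extract_wards(sample)
print(ET.tostring(sample, encoding="unicode"))
```

Parish names without the marker are left untouched.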

richardofsussex commented 4 years ago

@Captainkirkdawson would you prefer these to be encoded as <ward> or as <hamlet type="ward">?

richardofsussex commented 4 years ago

... or <township type="ward">?

Captainkirkdawson commented 4 years ago

@richardofsussex encoded as <ward> please

richardofsussex commented 4 years ago

Sorry: I can't tell which option you want, since the comment swallowed your markup!! (Use the <> insert code toolbar button to escape it)

Captainkirkdawson commented 4 years ago

Updated comment; sorry about that

richardofsussex commented 4 years ago

1851-modes-2.zip Dealt with the missing wards, plus some other minor tidy-ups. Is it OK now?

Captainkirkdawson commented 4 years ago

@richardofsussex The ward extraction has caused us to lose the piece number, i.e. name

<county tnaid="C132793" name="Lancashire" year="1851">
         <district tnaid="C3762909"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="1" name="St Martin" piece="HO 107/2176">
               <parish name="Liverpool">
                  <ward name="Scotland"/>
               </parish>
            </subdistrict>
         </district>
         <district tnaid="C3762910"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="1" name="St Martin" piece="HO 107/2177">
               <parish name="Liverpool">
                  <ward name="Scotland"/>
               </parish>
            </subdistrict>
         </district>
         <district tnaid="C3762911"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="2" name="Howard Street" piece="HO 107/2178">
               <parish name="Liverpool" note="(4) (6)">
                  <ward name="Vauxhall"/>
               </parish>
            </subdistrict>
         </district>
         <district tnaid="C3762912"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="RS 3" name="Dale Street">
               <parish name="Liverpool">
                  <hamlet type="ward" name="St Paul's and Exchange"/>
               </parish>
            </subdistrict>
         </district>
         <district tnaid="C3762913"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="RS 4" name="St George">
               <parish name="Liverpool">
                  <hamlet type="ward" name="Castle Street and St Peters" note="(3)"/>
               </parish>
            </subdistrict>
         </district>
         <district tnaid="C3762914"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="RS 5" name="St Thomas">
               <parish name="Liverpool">
                  <hamlet type="ward" name="Pitt Street and Great George"/>
               </parish>
            </subdistrict>
         </district>
         <district tnaid="C3762915"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="RS 6" name="Mount Pleasant">
               <parish name="Liverpool">
                  <hamlet type="ward" name="Rodney and Abercromby"/>
               </parish>
            </subdistrict>
         </district>
         <district tnaid="C3762916"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="RS 7" name="Mount Pleasant">
               <parish name="Liverpool">
                  <hamlet type="ward" name="Rodney and Abercromby" note="(3) (4)"/>
               </parish>
            </subdistrict>
         </district>
         <district tnaid="C3762917"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="RS 7" name="Islington">
               <parish name="Liverpool">
                  <hamlet type="ward" name="Lime Street and St Anne's"/>
               </parish>
            </subdistrict>
         </district>
         <district tnaid="C3762918"
                   code="461"
                   name="Liverpool"
                   note="(Street Indexed)"
                   year="1851">
            <subdistrict code="RS 7" name="Islington">
               <parish name="Liverpool">
                  <hamlet type="ward" name="Lime Street and St Anne's" note="(1)"/>
               </parish>
            </subdistrict>
         </district>
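
One way to list the affected entries is to scan for subdistricts that have no piece attribute. A sketch in Python, using two districts trimmed from the excerpt above and wrapped in a closed county element; the function name is illustrative and is not part of the ingest software:

```python
import xml.etree.ElementTree as ET

# Two districts trimmed from the excerpt above, wrapped in a closed <county>
EXCERPT = """
<county tnaid="C132793" name="Lancashire" year="1851">
  <district tnaid="C3762911" code="461" name="Liverpool" year="1851">
    <subdistrict code="2" name="Howard Street" piece="HO 107/2178">
      <parish name="Liverpool" note="(4) (6)"><ward name="Vauxhall"/></parish>
    </subdistrict>
  </district>
  <district tnaid="C3762912" code="461" name="Liverpool" year="1851">
    <subdistrict code="RS 3" name="Dale Street">
      <parish name="Liverpool"><hamlet type="ward" name="St Paul's and Exchange"/></parish>
    </subdistrict>
  </district>
</county>
"""

def subdistricts_missing_piece(county):
    """Return (district tnaid, subdistrict name) pairs with no piece attribute."""
    missing = []
    for district in county.iter("district"):
        for sub in district.iter("subdistrict"):
            if not sub.get("piece"):
                missing.append((district.get("tnaid"), sub.get("name")))
    return missing

county = ET.fromstring(EXCERPT)
print(subdistricts_missing_piece(county))
# [('C3762912', 'Dale Street')]
```
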

richardofsussex commented 4 years ago

1851-modes-2.zip This should fix that issue.

Captainkirkdawson commented 4 years ago

TY Completed