FreeUKGen / FreeCENMigration

Issue tracking for project migrating FreeCEN to FreeCEN2 genealogy record database and search engine architecture. Code developed here is based on that developed in MyopicVicar
https://www.freecen.org.uk
Apache License 2.0

Establish a long term Parms correction process #911

Open Captainkirkdawson opened 4 years ago

Captainkirkdawson commented 4 years ago

2/ Recruit specialist 1/ Coding task

Captainkirkdawson commented 4 years ago

geoff.jarvis@freeukgenealogy.org.uk Tue, 7 Jul, 04:25 (8 days ago) to Kirk

Kirk

I have picked up two errors in the PARMS. The civil parish is incorrect on the TNA site for both.

See row 2706. The correct name is Foxcote. The TNA have Forscote. I have checked the Gazetteer and no Forscote exists.

See row 3063. The correct name is Buckland Dinham. The TNA have Buckland Denham. I have checked the Gazetteer for this one also.

What do we do now?

Geoff

Captainkirkdawson commented 4 years ago

Kirk Dawson kirk.dawson.bc@gmail.com 7 Jul 2020, 16:46 (8 days ago) to Pat, Richard, Geoff, Kirk

Geoff, Pat and Richard

This is an interesting one and will likely be around for some time.

I suggest that possible errors such as these be referred to the FreeCEN data manager who confirms that there is indeed an error.

If so, that person requests that a system manager corrects the error online and also in the TNA extract. That person should also inform TNA of the error.

This presupposes that FreeCEN has an official data manager (Freereg does).

Perhaps Pat or Richard may have other views

I will add an edit capability at the SA level

Captainkirkdawson commented 4 years ago

I am more than happy for us to have a means of correcting place names in our PARMS. Apart from straightforward errors in the TNA source data like the two Geoff has found, there is the possibility that my parsing of the source data will have 'got it wrong', leading to nonsensical 'names'.

Where the TNA data is agreed to be in error, I would suggest adding a Note 'recorded as X by TNA' so the original information is not lost.
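A minimal sketch of how such a note could be recorded alongside the correction. The `ParmsEntry` structure and the `correct_civil_parish` helper are hypothetical names used for illustration only; they are not taken from the FreeCEN code base.

```python
from dataclasses import dataclass, field


@dataclass
class ParmsEntry:
    """One civil-parish row in the PARMS (hypothetical structure)."""
    row: int
    civil_parish: str
    notes: list[str] = field(default_factory=list)


def correct_civil_parish(entry: ParmsEntry, corrected: str) -> ParmsEntry:
    """Replace the parish name and keep the TNA spelling in a note."""
    if corrected != entry.civil_parish:
        # Preserve the original information before overwriting it.
        entry.notes.append(f"recorded as {entry.civil_parish} by TNA")
        entry.civil_parish = corrected
    return entry


# Row number and spellings are the first example reported in this thread.
entry = correct_civil_parish(ParmsEntry(2706, "Forscote"), "Foxcote")
print(entry.civil_parish, entry.notes)
```

Keeping the note on the same record means the TNA spelling stays searchable even after the displayed name changes.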

The question of what we tell TNA is a separate issue. When I came to re-process the 1911 data, I found that a number of entries in the source data had been changed from "Lady Woon" to the (more likely) Ladywood. This suggests that they are actively updating their records when mistakes are found. However, I think we should check with them that they would welcome our input, and establish a mechanism which fits with their working practices. I'm happy to ask Guy Grannum (Head of Systems Development) who has answered my other queries during this project. To some extent it will depend on what they are using as a source for their catalogue records, and whether they consider fidelity to that source to be more important than historical accuracy of the data in their records.

When Geoff talks of "the Gazetteer", which one does he mean? (Please pardon my ignorance here.)

Is there scope to do a comprehensive automated check on all the civil parishes (etc.), to generate a list of non-matching names? That would give us a sense of the scale of the tidying-up job we will have to do, as well as identifying which names need to be checked. If we have no other source we could use, I could do this check using the Administrative Units data from the Vision of Britain team, but I would need to remind myself what the query syntax is.

Richard
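A rough sketch of what such an automated check might look like, assuming both the PARMS civil parish names and a reference list of names are already available as plain lists. The function name `find_non_matching` and the tiny in-memory reference list are illustrative assumptions; loading real reference data (for example from Vision of Britain) is not shown, and fuzzy matching with Python's `difflib` is just one possible way to suggest candidates.

```python
from difflib import get_close_matches


def find_non_matching(parms_names, gazetteer_names, cutoff=0.7):
    """Return {parms_name: [closest reference names]} for names with no exact match."""
    # Compare case-insensitively, but report the reference spelling as given.
    known = {name.casefold(): name for name in gazetteer_names}
    report = {}
    for name in parms_names:
        if name.casefold() in known:
            continue  # exact match, nothing to check
        close = get_close_matches(name.casefold(), list(known), n=3, cutoff=cutoff)
        report[name] = [known[c] for c in close]
    return report


# Stand-in data built from names mentioned in this thread, not a real gazetteer.
gazetteer = ["Foxcote", "Buckland Dinham", "Ladywood"]
parms = ["Forscote", "Buckland Denham", "Ladywood"]

for name, suggestions in find_non_matching(parms, gazetteer).items():
    print(name, "->", suggestions)
```

The length of the returned dictionary gives a first estimate of the scale of the tidying-up job; names with an empty suggestion list would need checking by hand.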

Captainkirkdawson commented 4 years ago

geoff.jarvis@freeukgenealogy.org.uk 8 Jul 2020, 04:39 (7 days ago) to Richard, Pat, me, Kirk

Richard

I use two: Genuki and the England Gazetteer at https://www.gazetteer.org.uk/ (this has 19th-century parishes).

In this case I used the England Gazetteer.

Having been validating for about 12 years, I also knew it was wrong and what it should be, but I did verify it in case there were 2 versions of the spelling.

Geoff

Captainkirkdawson commented 4 years ago

Geoff,

Neither of these sites is particularly helpful for automated checking, being traditional search facilities where you have to run one search at a time and gaze in wonder at the results. It's a pity that neither site has an API.

I tried Buckland Dinham in GenUKI and in fact it did come up with both spellings:

https://www.genuki.org.uk/gazetteer?place=Buckland%20Dinham&county=-1&search_type=1&display_type=2

Richard
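For anyone who wants to generate these one-at-a-time lookups for a list of names, here is a small sketch that rebuilds the search URL shown above. The parameter names and values (`county=-1`, `search_type=1`, `display_type=2`) are copied from that URL as-is; what they mean is an assumption left unexplored, and the helper name `genuki_search_url` is hypothetical. The sketch only builds the link and does not fetch or parse the page.

```python
from urllib.parse import quote, urlencode

GENUKI_GAZETTEER = "https://www.genuki.org.uk/gazetteer"


def genuki_search_url(place: str) -> str:
    """Build a GENUKI gazetteer search link for one place name."""
    # Parameter values copied from the example URL in this thread.
    params = {"place": place, "county": -1, "search_type": 1, "display_type": 2}
    # quote_via=quote encodes spaces as %20, matching the original link.
    return f"{GENUKI_GAZETTEER}?{urlencode(params, quote_via=quote)}"


print(genuki_search_url("Buckland Dinham"))
```

This still leaves a person to open each link and read the result, so it speeds up the manual check rather than replacing it.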

PatReynolds commented 4 years ago

@PatReynolds to write a role description (and where to recruit).

PatReynolds commented 4 years ago

Having discussed with @richardofsussex - @PatReynolds to talk to TNA about us feeding back / sharing linked data; it would be good to know when recataloguing takes place, but we will not necessarily take any action. @PatReynolds to talk to other holders of census records to see how their cataloguing data (perhaps not visible to the public) is structured, and if we can use it / return it as linked data.

PatReynolds commented 3 years ago

Contacted TNA 4 Jan 2021

DeniseColbert commented 3 years ago

Next steps need clarifying @PatReynolds

PatReynolds commented 3 years ago

@PatReynolds to chase TNA

PatReynolds commented 3 years ago

Chased 16 June 2021

PatReynolds commented 3 years ago

Contact at TNA says: "It would be very good to have a record of any catalogue errors you’ve detected. Can you give me an indication of the scale/volume of errors and the format of your report? I’m sure it’ll contain the citable references but let me know a little more please."
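One possible shape for such a report, sketched as a CSV with one row per suspected error. The column names and the `build_report` helper are assumptions for illustration, not an agreed format. The two rows are the examples reported earlier in this thread, and the `tna_reference` column is left blank because the citable references are not given here.

```python
import csv
import io

# Hypothetical column layout for an error report to TNA.
FIELDS = ["parms_row", "tna_spelling", "corrected_spelling",
          "checked_against", "tna_reference"]

errors = [
    {"parms_row": 2706, "tna_spelling": "Forscote",
     "corrected_spelling": "Foxcote",
     "checked_against": "England Gazetteer", "tna_reference": ""},
    {"parms_row": 3063, "tna_spelling": "Buckland Denham",
     "corrected_spelling": "Buckland Dinham",
     "checked_against": "England Gazetteer", "tna_reference": ""},
]


def build_report(rows):
    """Return the report as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


report = build_report(errors)
print(f"{len(errors)} suspected errors")
print(report)
```

The row count answers the scale question directly, and a flat CSV is easy to open in a spreadsheet if TNA prefer that.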

geoffj-FUG commented 2 years ago

We now have a process for correcting PARMS within FreeCEN. Errors in the PARMS show up as a Civil Parish error in the Error report or the No POB report. These are the first reports a transcriber runs, so the error is identified at transcription. The Coordinator downloads the transcription to provide a backup copy, then removes the transcription from FC2. The Coordinator advises Rhoda (data manager) of any amendments required to the PARMS. She makes the change and advises the Coordinator, who then reloads the transcription to the transcriber's folder on FC2.

There is no established process for advising TNA that I am aware of. Rhoda may have developed one.

Geoff