nationalarchives / hms-nhs-scripts

MIT License

Flag cases where other than X transcribers were used for autoresolution #17

Closed bogden1 closed 2 years ago

bogden1 commented 2 years ago

X is most likely 3, as this is the number of classifications required. More classifications happen in some cases, and sometimes fewer happen where one or two transcribers have entered blanks in text fields.

(There is another complication around what happens when the same person has entered more than one transcription, but let's not get into that and just treat all classifiers the same way, even when the same person happens to pop up more than once.)

views_joined.csv already provides this information. The only thing to do here is to flag for manual checks in joined.csv when the number of views on which a text autotranscription is based is fewer than 3. (It should never be fewer than 3 for a dropdown, but we can put in a belt-and-braces check for this while we're at it.)
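As a rough illustration of the check described above, here is a minimal sketch using only the standard library. It assumes views_joined.csv holds one numeric view count per cell plus a key column; the column names (`entry`, `admission number`, `name`), the function name and the sample data are all hypothetical, not taken from the repository.

```python
import csv
import io

REQUIRED_VIEWS = 3  # assumption: the "X" discussed in this issue


def rows_needing_manual_check(views_csv, key_column, required=REQUIRED_VIEWS):
    """Return the keys of rows where any view count is below `required`.

    Empty cells are skipped here; whether they should also be flagged
    is a separate decision.
    """
    flagged = []
    for row in csv.DictReader(views_csv):
        for column, value in row.items():
            if column == key_column or value is None or value.strip() == "":
                continue
            # float() tolerates counts written as "3" or "3.0"
            if float(value) < required:
                flagged.append(row[key_column])
                break  # one low count is enough to flag the row
    return flagged


# Hypothetical sample standing in for views_joined.csv
sample = io.StringIO(
    "entry,admission number,name\n"
    "1,3,3\n"
    "2,3,2\n"
    "3,4,\n"
)
flagged = rows_needing_manual_check(sample, key_column="entry")
print(flagged)
```

The flagged keys could then be used to mark the matching rows in joined.csv for manual checking; how that marking is recorded is left open here.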

bogden1 commented 2 years ago

count_text_views does actually count blanks. So this situation can only arise when running with --unfinished, in which case we are explicitly asking to let through incomplete data. So I think this can be rejected.

A sanity check confirms that no cells in my current reference views_joined.csv contain a value under 3. The "port sailed out of" cells in volume 1 are empty, as expected. No other cells are empty.
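For reference, a sanity check along these lines could look something like the sketch below. It is not the script actually used: the key columns (`volume`, `page`), the `years at sea` column, the function name and the sample rows are assumptions made for illustration. It reports the lowest view count found and how many cells are empty in each column.

```python
import csv
import io
from collections import Counter


def summarize_views(views_csv, key_columns):
    """Return (lowest view count seen, Counter of empty cells per column)."""
    lowest = None
    empties = Counter()
    for row in csv.DictReader(views_csv):
        for column, value in row.items():
            if column in key_columns:
                continue
            if value is None or value.strip() == "":
                empties[column] += 1
                continue
            count = float(value)  # accepts "3" as well as "3.0"
            if lowest is None or count < lowest:
                lowest = count
    return lowest, empties


# Hypothetical sample standing in for views_joined.csv
sample = io.StringIO(
    "volume,page,port sailed out of,years at sea\n"
    "1,1,,3\n"
    "1,2,,4\n"
    "2,1,3,3\n"
)
lowest, empties = summarize_views(sample, key_columns=("volume", "page"))
print(lowest, dict(empties))
```

A lowest value of at least 3, with empties confined to the expected column, would match the result reported above.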