FreeUKGen / FreeCENMigration

Issue tracking for the project migrating FreeCEN to the FreeCEN2 genealogy record database and search engine architecture. Code developed here is based on that developed in MyopicVicar.
https://www.freecen.org.uk
Apache License 2.0

Use OCR+ machine learning to identify and extract data on images of census records #575

Open PatReynolds opened 5 years ago

PatReynolds commented 5 years ago

Traditionally, transcribers have not captured 100% of the place information on census records. This project is to capture it (possibly based on an existing tool) for the fields where we have not done so previously.

Priority: E2, U3, S5, C3?: P8

PatReynolds commented 5 years ago

@richpomfret can you add in the link to the FreePROBATE work here, please. @benwbrum please improve the description (i.e. what fields, and anything else).

PatReynolds commented 2 years ago

for Committee to review - possible solution to data missing in FreeCEN1 transcriptions

PatReynolds commented 2 years ago

Number of rooms in 1891 and 1901 - missing but are being added manually.

Pat to check what others were missing.

PatReynoldsFUG commented 2 years ago

Information from start of ED in FreeCEN1 is missing in at least 40% of transcriptions.

The Occupations column would also benefit from being looked at again, as character length had an impact.

WinCC used to suggest Ecclesiastical Parish from PARMS.

PatReynoldsFUG commented 1 year ago

I am not sure if this is still a valid story - has it been overtaken?

PatReynoldsFUG commented 1 year ago

Pat to look into this further (for discussion at our next meeting).

PatReynoldsFUG commented 7 months ago

https://docs.google.com/document/d/1oQA87M2x99ypnVmKQkORy23Kq8IqzsDwvT7ZMxSFT5M/edit?usp=sharing will contain examples (work in progress)

DeniseColbert commented 3 months ago

To review when Pat returns.