FreeUKGen / FreeCENMigration

Issue tracking for project migrating FreeCEN to FreeCEN2 genealogy record database and search engine architecture. Code developed here is based on that developed in MyopicVicar
https://www.freecen.org.uk
Apache License 2.0

Add Disability Search #757

Closed PatReynolds closed 4 years ago

Captainkirkdawson commented 4 years ago

Assuming this is simply a filter of a broad search then it is relatively straightforward

Captainkirkdawson commented 4 years ago

The rake task provides us with the following content;

1841: "`", "++++++", "Infirm", "Crippl"
1851: 202 entries
1861: 404 entries
1871: 777 entries
1881: 155 entries
1891: 1237 entries
Total: 1879 entries

An Excel file with a sheet for each year has been uploaded to the drive at https://drive.google.com/drive/u/1/folders/0BykgQKwJtk6faHhUWnRsWklNWWc

Cen1 has a ticky to select disabled, i.e. any form. Is that what we wish to do?

Note there are many garbage entries
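A rough sketch of the kind of per-year tally such a rake task might produce. This is not the actual task: `CensusRow`, `disability_tally` and the sample rows are made up for illustration, the real task reads from the database, and counting distinct values (rather than records) is an assumption.

```ruby
# Hypothetical in-memory stand-in for the individuals collection.
CensusRow = Struct.new(:year, :disability)

# Count each distinct non-blank disability value, grouped by census year.
def disability_tally(rows)
  rows
    .reject { |row| row.disability.to_s.strip.empty? }
    .group_by(&:year)
    .transform_values { |group| group.map(&:disability).tally }
end

# Made-up sample rows; blank and nil values are skipped by the tally.
TALLY_SAMPLE = [
  CensusRow.new(1841, "Infirm"),
  CensusRow.new(1841, "Crippl"),
  CensusRow.new(1841, "Infirm"),
  CensusRow.new(1851, "Blind"),
  CensusRow.new(1851, nil),
  CensusRow.new(1851, "")
].freeze

disability_tally(TALLY_SAMPLE).each do |year, counts|
  puts "#{year}: #{counts.size} distinct entries"
  counts.each { |value, count| puts "  #{value.inspect} x #{count}" }
end
```

Listing every distinct value, rather than only a count, is what makes the garbage entries visible.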

FreecenBren commented 4 years ago

Column W in the FIELDS document, for Disabilities, says the following. This is what transcribers go by; it replaced the INCENS instructions.

Column W; Disability; Blank; Up to six characters; Strictly, the Census asked only "Deaf, Dumb, Blind"; many Enumerators also wrote such comments as "Imbecile" or "Cripple". We can record these options, with abbreviation as necessary (enter the full wording in Column Y). "Df&Dm", "Imbcil", "Crippl" would be acceptable.

The instructions given to transcribers suggest that if the disability is too long to describe in 6 characters, we can also add the full disability in the notes column.

Hoping that FreeCEN2 might allow more characters than the maximum of 6 we use now when transcribing.

I have looked at the Google Docs and all the different ones we have used. As you see, all are only 6 characters. That is why we add the longer version in Notes, to hopefully explain the full disability.

Captainkirkdawson commented 4 years ago

Sorry @FreecenBren but I cannot comment on what should be there. All I can do is report on what is actually there. The entries are all 6 or fewer characters, e.g. "Water" "Wathd" "Wd Leg" "Wd-Leg" "Wdnleg" "Weak H" "Weak I" "Weak M" "Weak S" "Weak" "Weakiq" "Weakly" "Weakmd" "Weakne" "Weakns" "Wekmnd" "Well" "W-Hand" "Whoop" "Wills" "Wise" "Wither" "Wk Int" "Wk Min" "Wk Mnd" "Wk.Int" "Wkbrn" "Wkhrt" "Wkint" "Wk-Int" "Wkint*" "Wkintl"
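To illustrate how many of these entries are spelling variants of one another, here is a minimal sketch that groups them under a normalised key. The rule (lowercase, drop everything except letters and digits) and the names `normalise` and `group_variants` are assumptions for illustration only, not anything in the FreeCEN code.

```ruby
# Reduce an entry to lowercase letters and digits so that spacing and
# punctuation variants collapse to the same key (an assumed rule).
def normalise(entry)
  entry.downcase.gsub(/[^a-z0-9]/, "")
end

# Group raw entries by their normalised key.
def group_variants(entries)
  entries.group_by { |entry| normalise(entry) }
end

# A few of the values quoted in this thread.
VARIANT_SAMPLE = [
  "Wk Int", "Wk.Int", "Wkint", "Wk-Int", "Wkint*",
  "Wd Leg", "Wd-Leg", "Wdnleg"
].freeze

group_variants(VARIANT_SAMPLE).each do |key, variants|
  puts "#{key}: #{variants.join(', ')}"
end
```

A rule this simple only catches punctuation and spacing differences; "Wdnleg" and "Wd Leg" would still be treated as different entries.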

PatReynolds commented 4 years ago

The question is (I think) whether search should be on column W or column Y. As column Y is twys, and column W is a FreeCEN abbreviation, there are arguments for both. Searching column Y will pick up entries if people are specifically looking for something (e.g. Limbless, or mentioning Heart). But such searches will be better served by providing access to the records as Open Data. Column W, on the other hand, is already used for a reduced set of possible disabilities.

So I think we should go with a ticky that returns any record with an entry in W.

Captainkirkdawson commented 4 years ago

I am totally lost @PatReynolds; we have no such fields as Y and W in individuals. The available fields are:
field :sequence_in_household, type: Integer
field :individual_flag, type: String
field :surname, type: String
field :forenames, type: String
field :name_flag, type: String
field :relationship, type: String
field :marital_status, type: String
field :sex, type: String
field :age, type: String
field :age_unit, type: String
field :detail_flag, type: String
field :occupation, type: String
field :occupation_flag, type: String
field :birth_county, type: String
field :birth_place, type: String
field :verbatim_birth_county, type: String
field :verbatim_birth_place, type: String
field :birth_place_flag, type: String
field :disability, type: String
field :language, type: String
field :notes, type: String

FreecenBren commented 4 years ago

This is the best I can do with what relates back to the FreeCEN Spreadsheet. Fields attached give you more details re the columns in the initial transcription.

| Column Letter in FreeCEN Spreadsheet | Rake Report |
| --- | --- |
| System produced to form the final VLD | field :sequence_in_household, type: Integer |
| System produced to form the final VLD | field :individual_flag, type: String |
| I | field :surname, type: String |
| J | field :forenames, type: String |
| K = query on names etc. we use an X | field :name_flag, type: String |
| L | field :relationship, type: String |
| M | field :marital_status, type: String |
| N | field :sex, type: String |
| O | field :age, type: String |
| Query Column P | field :age_unit, type: String |
| | field :detail_flag, type: String |
| Q | field :occupation, type: String |
| R = occupation Status | field :occupation_flag, type: String |
| T | field :birth_county, type: String |
| U | field :birth_place, type: String |
| Query column V | |
| System produced to form the final in Valdrev | field :verbatim_birth_county, type: String |
| System produced to form the final in Valdrev | field :verbatim_birth_place, type: String |
| | field :birth_place_flag, type: String |
| W | field :disability, type: String |
| X | field :language, type: String |
| Y | field :notes, type: String |
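For anyone translating between the two vocabularies, the mapping above could be sketched as a simple lookup. The constant `COLUMN_TO_FIELD` and the helper `field_for` are hypothetical names. Only the rows that pair a single column letter with a field are included; the query columns P and V and the system-produced fields are left out.

```ruby
# Spreadsheet column letter => field name in individuals, copied from the
# mapping above. Hypothetical constant, not part of the FreeCEN code.
COLUMN_TO_FIELD = {
  "I" => :surname,
  "J" => :forenames,
  "K" => :name_flag,
  "L" => :relationship,
  "M" => :marital_status,
  "N" => :sex,
  "O" => :age,
  "Q" => :occupation,
  "R" => :occupation_flag,
  "T" => :birth_county,
  "U" => :birth_place,
  "W" => :disability,
  "X" => :language,
  "Y" => :notes
}.freeze

# Look up the field for a column letter, case-insensitively.
# Raises ArgumentError for letters that are not in the mapping.
def field_for(column_letter)
  COLUMN_TO_FIELD.fetch(column_letter.upcase) do
    raise ArgumentError, "no field recorded for column #{column_letter}"
  end
end

puts "Column W is #{field_for('W')}"
puts "Column Y is #{field_for('Y')}"
```

So "column W" and "column Y" in the discussion above correspond to the disability and notes fields.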


PatReynolds commented 4 years ago

Implement "Restrict search to persons with disabilities (tickybox)"

Captainkirkdawson commented 4 years ago

Code available for evaluation on test3 CEN.

PatReynolds commented 4 years ago

Search found two records, but neither was of a disabled person. [image]

Repeating with a wider search gives 9 results. Two of them, view 4 and view 5 (Eliza Alcock), are correct; the others are false positives. [image]

I suspect the search is interpreting a "dash" as a positive result. I have tried other searches: false positives always have a dash in the disability field, and those without dashes are never false positives.

Captainkirkdawson commented 4 years ago

There are no false positives from a code perspective. The ticky box implementation does NOT look at the actual contents. It simply asks 'is there anything in this field?' There is no quality assessment of its content.
As the original checking of this field showed there are many strange entries. Look at the sheet for all years in the spreadsheet https://drive.google.com/open?id=1afnPVobQXI8o44_PuRSlYLRm5lVHstHo
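The difference between the two views can be sketched in plain Ruby. This is an illustration, not the deployed code: `ExamplePerson`, `any_entry?` and `meaningful_entry?` are hypothetical names, the sample rows are made up, and treating whitespace-only values as empty is an assumption. The stricter check is only one possible rule and is not something the search implements.

```ruby
# Hypothetical stand-in for a search record.
ExamplePerson = Struct.new(:name, :disability)

# The tick box as described above: any non-blank content counts.
def any_entry?(person)
  !person.disability.to_s.strip.empty?
end

# A possible stricter rule (assumption): require at least one letter or
# digit, so values such as "-" or "++++++" are not treated as a disability.
def meaningful_entry?(person)
  person.disability.to_s.match?(/[[:alnum:]]/)
end

# Made-up rows using values of the kind reported in this thread.
FILTER_SAMPLE = [
  ExamplePerson.new("Example A", "Blind"),
  ExamplePerson.new("Example B", "-"),
  ExamplePerson.new("Example C", "++++++"),
  ExamplePerson.new("Example D", nil)
].freeze

FILTER_SAMPLE.each do |person|
  puts "#{person.name}: any entry = #{any_entry?(person)}, " \
       "meaningful = #{meaningful_entry?(person)}"
end
```

Under the presence-only check the dash rows are returned, which matches the results reported above; the stricter rule would drop them but could also drop genuine entries that were transcribed unusually.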

PatReynolds commented 4 years ago

Need to add to help: Disability search can sometimes give a "false positive" - when you check the details, there is a dash or some other entry in the disability field, rather than an indication of disability.

Captainkirkdawson commented 4 years ago

deployed