fisharebest / webtrees

Online genealogy
https://webtrees.net
GNU General Public License v3.0

United States Census Assistant #810

Closed Shemwell closed 8 years ago

Shemwell commented 8 years ago

Missing Census years 1790 thru 1840 & 1940.

1850 - Names read First-Last, Birth Place reads "USA" not State, MAR column enters Y for married for all years after 1849? should only be within the year, does not exceed page.

1860 - Names read First-Last, Birth Place reads "USA" not State, MAR column enters Y for married for all years after 1859? should only be within the year, does not exceed page.

1870 - First-Last, Birth Place reads "USA" not State, enters Y for all FFB & MFB unless no parents entered, month MAR column enters month married for all years after 1869? should only be within the year, does not exceed page.

1880 - First-Last, all Birth Places read "USA" not State. The census gives tick columns for Married, Divorced and Widowed; Census Asst. enters nothing. I think a single column with M/D/W entered would be sufficient and easier to read.

1890 - First-Last, all Birth Places read "USA" not State, missing married/single, missing children born & children living, month MAR column enters month married for all years after 1889? should only be within the year, does not exceed page.

1900 - First-Last, all Birth Places read "USA" not State, missing married/single, does not exceed page.

1910 - Last-First, all Birth Places read "USA" not State, missing married/single, exceeds page width.

1920 - Last-First, all Birth Places read "USA" not State, missing married/single, does not exceed page.

1930 - Last-First, all Birth Places read "USA" not State, missing married/single, missing Age married, exceeds page width.

State/Country name abbreviations are also missing. The previous Census Asst. showed names as First-Last; some in 1.7.3 read L/F (the US census reads L/F). I already have over 400 entered as F/L and it's easier to read (my preference). Otherwise, is there an easy method to convert the ones already entered, for consistency? Could column headers be abbreviated to make it smaller on page? Many columns in census records are left blank; if there's nothing entered, could they be omitted?

Thanks

fisharebest commented 8 years ago

Issues with more than one item are very difficult to manage.

Missing censuses - I had difficulty finding scans of census images for the older censuses that were of high enough resolution to be readable. If you'd like to contribute these, the format is very simple. You can copy one of the existing ones, such as

https://github.com/fisharebest/webtrees/blob/master/app/Census/CensusOfUnitedStates1850.php

1850 - Names read First-Last,

Name reads first-last. Is this a statement or a question? The example on Wikipedia shows first name, then last name - e.g. https://upload.wikimedia.org/wikipedia/commons/1/17/1850_census_Lincoln.gif - so I assume this is correct.

Birth Place reads "USA" not State,

Ah. The logic is to show the country - unless the birth country matches the census country, in which case the next level is shown. I guess that you are using the abbreviation USA instead of the name United States, and "USA" doesn't match "United States". I suspect that "Florida, United States" will show as "Florida", but "Florida, USA" will show as USA.
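The rule described above can be sketched in a few lines. This is an illustrative Python sketch, not the webtrees PHP code; the function name `census_birth_place` and the assumption that places are comma-separated with the country last are hypothetical.

```python
def census_birth_place(birth_place: str, census_country: str) -> str:
    """Show the birth country, or the next level down when the
    birth country is the same as the census country.

    Assumes places are written "lower level, ..., country".
    """
    parts = [p.strip() for p in birth_place.split(",") if p.strip()]
    if not parts:
        return ""
    country = parts[-1]
    # Exact string comparison: "USA" is not treated as "United States".
    if country == census_country and len(parts) > 1:
        return parts[-2]
    return country


print(census_birth_place("Florida, United States", "United States"))
print(census_birth_place("Florida, USA", "United States"))
```

With an exact comparison like this, the abbreviation never matches the full country name, which would explain why every birth place shows as "USA".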

MAR column enters Y for married for all years after 1849? should only be within the year,

The census date is 1 Jun 1850, so I assumed this question to mean the 12 months preceding the census. Are you saying it means simply the same calendar year - i.e. the last 6 months?
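The two readings of this question differ only in the date window that is tested. A minimal Python sketch of both, using the census date of 1 Jun 1850 from above; the function names are hypothetical and this is not how webtrees implements the column.

```python
from datetime import date


def one_year_before(d: date) -> date:
    """Same day in the previous year (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)


def married_in_preceding_12_months(marriage: date, census: date) -> bool:
    """Reading 1: married in the 12 months up to the census date."""
    return one_year_before(census) < marriage <= census


def married_in_census_calendar_year(marriage: date, census: date) -> bool:
    """Reading 2: married in the census year, on or before the census date."""
    return marriage.year == census.year and marriage <= census


census_1850 = date(1850, 6, 1)
christmas_wedding = date(1849, 12, 25)
print(married_in_preceding_12_months(christmas_wedding, census_1850))
print(married_in_census_calendar_year(christmas_wedding, census_1850))
```

A marriage on 25 Dec 1849 is counted under the first reading but not under the second, so the choice matters for anyone married late in the year before the census.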

does not exceed page.

Sorry - I do not understand what you mean by this.

Could column headers be abbreviated to make it smaller on page?

As you will see from the code that I linked above, each column has both a full title (the exact text from the original census), and a short abbreviation. I tried to retain some of the abbreviations that were used on the old census assistant, where they made sense.

I'm more than happy to accept suggestions for improved column headings, etc.

Shemwell commented 8 years ago

Ok, I’ll give it a shot, one item at a time. Names read First-Last.

US census records 1790 thru 1860 read First name-Last name, US census records 1870 thru 1940 read Last name, First name.

Development Version (and 1.7.3) prior to 1910 reads F-L, 1910 thru 1930 read L,F. If the goal is to be like the US Census, 1870 on should read L,F.

CensusOfUnitedStates1900.php line 38 - new CensusColumnFullName($this, 'Name', 'Name'),

CensusOfUnitedStates1910.php line 38 - new CensusColumnSurnameGivenNameInitial($this, 'Name', 'Name'),

The last name is almost always given only for the head of household, followed by a ---- or “space” or “” and the first name for the others, unless an individual in the household has a different last name. However, some census takers added the last name for all individuals, and some wrote the last name once and then only the initials of family members. I’m sure the intent was to indicate the same last name.

123 Last, First, head
-----, First, wife
First, son
Last, First, father in law
-----, First, mother in law
124 Last, First, head
Etc…

The opposite applies to census records prior to 1910.

123 First, Last, head
First, -----, wife
Etc…

Due to inconsistencies of the census takers an accurate transcription using the assistant would be nearly impossible.

The old census asst. simply reads all records First name Last name. I personally prefer this method as it is consistent and much easier to read.

It would be a really nice enhancement if you could link the name in the saved assistant record to the individual.

How do you wish to proceed?

fisharebest commented 8 years ago

You've already found the census definition files - and they were designed to be easy to read.

So, would it be quicker and easier if you just gave me the list of CensusColumnXXX() definitions for each year?

Shemwell commented 8 years ago

Sorry... Looking at the 1790 Census, I get CensusColumnFullName & CensusColumnOccupation, then I'm lost. I found that if I changed the place in CensusOfUnitedStates from "United States" to "USA", the non-abbreviated State names appear. Here are links for easily readable census records.

1790 - http://www.mymcpl.org/_uploaded_resources/MGC-1790censusblank.pdf
1800 - http://www.mymcpl.org/_uploaded_resources/MGC-1800censusblank.pdf
1810 - http://www.mymcpl.org/_uploaded_resources/MGC-1810censusblank.pdf
1820 - http://www.mymcpl.org/_uploaded_resources/MGC-1820censusblank.pdf
1830 - http://www.mymcpl.org/_uploaded_resources/MGC-1830censusblank.pdf
1840 - http://www.mymcpl.org/_uploaded_resources/MGC-1840censusblank.pdf
1850 - http://www.mymcpl.org/_uploaded_resources/MGC-1850censusblank.pdf
1860 - http://www.mymcpl.org/_uploaded_resources/MGC-1860censusblank.pdf
1870 - http://www.mymcpl.org/_uploaded_resources/MGC-1870censusblank.pdf
1880 - http://www.mymcpl.org/_uploaded_resources/MGC-1880censusblank.pdf
1890 - http://www.mymcpl.org/_uploaded_resources/MGC-1890censusblank.pdf
1900 - http://www.mymcpl.org/_uploaded_resources/MGC-1900censusblank.pdf
1910 - http://www.mymcpl.org/_uploaded_resources/MGC-1910censusblank.pdf
1920 - http://www.mymcpl.org/_uploaded_resources/MGC-1920censusblank.pdf
1930 - http://www.mymcpl.org/_uploaded_resources/MGC-1930censusblank.pdf
1940 - http://www.mymcpl.org/_uploaded_resources/MGC-1940censusblank.pdf

Shemwell commented 8 years ago

Got it... sort of... but I'll need your input. Until the 1850 census a family was listed on a single row. Only the head of household is named, and all others are just counted in age brackets (16<26 etc.). I think it would be very useful to enter the age-bracket counts manually on the top HOH row (like the census), and then allow the names of the individuals with relationship/sex/age (not in the census) to be added, to help determine which individuals these numbers refer to. It's just not census specific.

How/where do I add abbreviations for states, countries etc.? E.g. Australia = AUS, Alabama = AL, like in census-edit.php in the old assistant. I'll send them all to you this weekend for your professional tweaking and approval.

Shemwell commented 8 years ago

Census.zip

Some of the census records have way too many columns for the page, and some only require the name of the HOH with tick marks or numbers for age brackets, so I did the best I could. I'll work on it some more.

With CensusColumnConditionEnglish($this, 'Cond', 'Whether single, married, widowed, or divorced'), Widowed does not show up; the condition stays Married. I still have no clue how to add abbreviations for countries or states. Sorry, wish I could have done a better job...

sleebooth commented 8 years ago

Thanks for all the work you are doing on this.

Regards,
John

On 31/01/2016 22:47, Shemwell wrote:

Census.zip https://github.com/fisharebest/webtrees/files/111552/Census.zip


fisharebest commented 8 years ago

I've merged in your changes for the 1850-1940 censuses.

"USA" / "United States" needs more thought. Fixing this for you will break it for others...

The earlier censuses need more thought.

Sorry, wish I could have done a better job...

On the contrary! The .ZIP file was exactly what I needed, and the .PDFs are very useful.

Shemwell commented 8 years ago

I use USA because of the Google map add-on in the Google Maps™ module. If I use United States it doesn't work. Should I change them all and rework the .csv in Google maps?

wkitty42 commented 8 years ago

this linking with google maps is exactly why i go through the trouble to massage all data in my gedcoms being imported to conform... i actually wrote a pascal program some time back to do this but it was cheap and fast and needed to be hardcoded to fix what was being worked on at the time... it is a painful process and i totally dread trying to work with merging newer data from other programs... the few users we do have working on our gedcoms still have to do things their way which leads to more work for the admins to clean up :( :( :(

fisharebest commented 8 years ago

IIRC, the googlemap download files use chapman codes, because it was intended that users would search/replace the code to the target country name. For example, I would s/ENG/England/ whereas a French user would s/ENG/Angleterre/.

As for cleaning up your data, the control panel has a "change place names" option, which would allow you to convert all your data from "USA" to "United States" in one go.

For the GoogleMaps data, you could export it to a CSV file, run a search/replace on that, and then re-import it?
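A rough sketch of that search/replace step, in Python rather than in webtrees itself. The semicolon delimiter and the column layout in the sample are assumptions about the exported file, and `replace_place_name` is a hypothetical helper; check the real export before running anything like this on it.

```python
import csv
import io


def replace_place_name(csv_text: str, old: str, new: str,
                       delimiter: str = ";") -> str:
    """Replace cells that are exactly `old` with `new`.

    Only whole cells are matched, so "USA" inside a longer
    place name is left alone.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        writer.writerow([new if cell == old else cell for cell in row])
    return out.getvalue()


# Hypothetical sample; the real export may have different columns.
sample = "Level;Country;State\n0;USA;\n1;USA;Florida\n"
print(replace_place_name(sample, "USA", "United States"))
```

Matching whole cells instead of doing a plain text replace is the safer choice here, since a short code such as ENG or USA can also occur inside other place names.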

wkitty42 commented 8 years ago

on cleaning my data, the country id is not so bad... what is the worst is when folks do not leave empty fields for sections they do not know... city, county, state, country... some will leave out the county and others will list them in the opposite order... like i said, it is nasty and painful to have to redo them when importing from other programs... especially with ~30000 individuals and ~10000 marriages... anyway, i only tossed that out there due to the conversation about country ids and linking in googlemaps...

ddrury commented 8 years ago

Just noticed, I have a US 1940 census image which broadly matches that given in http://www.mymcpl.org/_uploaded_resources/MGC-1940censusblank.pdf as noted above (layout different - columns the same) but neither of these match the column list in CensusOfUnitedStates1940.php.

I'll submit a PR if necessary, but I'd like someone with more knowledge than me to: a) confirm I'm right, and b) determine which columns (if any) could be ignored.