Now look at Washington_Co. Maine.
awk 'BEGIN {FS=OFS="|"} $4~/Maine/&& $3~/Washington_Co./ {print $99}' df|e '[A-z]'|wc -l
4978 total collector strings.
awk 'BEGIN {FS=OFS="|"} $4~/Maine/&& $3~/Washington_Co./ {print $99}' df|e '[A-z]'|e 'Ms\.'|wc -l
1327 strings contain "Ms." -- 27%
Continuing to look for a pre- vs. post-1965 difference.
$ awk 'BEGIN {FS=OFS="|"} {print $99}' NEdfbn|e '[A-z]'|wc -l
107019
awk 'BEGIN {FS=OFS="|"} $100>1965 {print $99,$100}' NEdfbn|e '[A-z]'|wc -l
55244, so 55244 records are after 1965.
awk 'BEGIN {FS=OFS="|"} $100>1965&&$99~/Ms\./ {print $99,$100}' NEdfbn|e '[A-z]'|wc -l
11118, so 11118/55244 = 20% post-1965.
awk 'BEGIN {FS=OFS="|"} $100<=1965 {print $99,$100}' NEdfbn|e '[A-z]'|wc -l
So 52435 total in or before 1965.
awk 'BEGIN {FS=OFS="|"} $100<=1965&&$99~/Ms\./ {print $99,$100}' NEdfbn|e '[A-z]'|wc -l
7239
so 7239/52435 = 14 % (and I expect mostly Cummings).
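The pre/post split above could probably be done in one pass instead of four. A minimal sketch, not what was actually run: the function name ms_by_era and its arguments are made up for illustration, and it assumes a pipe-delimited file with the collector and year columns passed in (99 and 100 in NEdfbn above). It uses [A-Za-z] rather than [A-z] and skips records without an all-digit year, so its totals may not match the counts logged above.

```shell
# usage: ms_by_era FILE COLLECTOR_COL YEAR_COL CUTOFF   (hypothetical helper)
ms_by_era() {
  awk -F'|' -v k="$2" -v y="$3" -v cut="$4" '
    # keep rows whose collector string has a letter and whose year is all digits
    $k ~ /[A-Za-z]/ && $y ~ /^[0-9]+$/ {
      era = ($y + 0 > cut) ? "post" : "pre"
      total[era]++
      if ($k ~ /Ms\./) ms[era]++        # literal "Ms." (dot escaped)
    }
    END {
      # one line per era: era|total|Ms. count|percent
      for (e in total)
        printf "%s|%d|%d|%.0f%%\n", e, total[e], ms[e], 100 * ms[e] / total[e]
    }' "$1" | sort
}
```

Something like `ms_by_era NEdfbn 99 100 1965` would then print one "post" line and one "pre" line.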
-Redo for NEdfbn (previously done on df)
awk 'BEGIN {FS=OFS="|"} {print $99}' NEdfbn|e '[A-z]'|wc -l
106737 total collector entries.
awk 'BEGIN {FS=OFS="|"} {print $99}' NEdfbn|e '[A-z]'|e 'Ms\.'|wc -l
135114
19230
so 14% -- well below what Jim thought would be worth commenting on.
Let's look at Grafton Co.
$ awk 'BEGIN {FS=OFS="|"} $3~/Grafton_Co./ {print $99}' df|e '[A-z]'|wc -l
6699
$ awk 'BEGIN {FS=OFS="|"} $3~/Grafton_Co./ {print $99}' df|e '[A-z]'|e 'Ms.'|wc -l
2387
so 36%, well above Jim's 20% limit.
How about Aroostook Co.
$ awk 'BEGIN {FS=OFS="|"} $3~/Aroostook_Co./ {print $99}' df|e '[A-z]'|wc -l
15542
$ awk 'BEGIN {FS=OFS="|"} $3~/Aroostook_Co./ {print $99}' df|e '[A-z]'|e 'Ms.'|wc -l
1343
9%... the Selva effect.
How about Norfolk.
$ awk 'BEGIN {FS=OFS="|"} $3~/Norfolk_Co./ {print $99}' df|e '[A-z]'|e 'Ms.'|wc -l
1172
$ awk 'BEGIN {FS=OFS="|"} $3~/Norfolk_Co./ {print $99}' df|e '[A-z]'|wc -l
2410
49%
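Rather than repeating the pair of commands county by county, the same tally could be sketched as a single awk pass that groups by the county field. This is only an illustration under assumptions: the function name ms_share is made up, the input is taken to be pipe-delimited, and the county and collector columns are passed in (3 and 99 in df above). It matches a literal "Ms\." and uses [A-Za-z] rather than [A-z] (which in ASCII also matches a few punctuation characters such as "_"), so the numbers may differ slightly from the ones logged above.

```shell
# usage: ms_share FILE COUNTY_COL COLLECTOR_COL   (hypothetical helper)
ms_share() {
  awk -F'|' -v c="$2" -v k="$3" '
    # keep only rows whose collector string contains a letter
    $k ~ /[A-Za-z]/ {
      total[$c]++
      if ($k ~ /Ms\./) ms[$c]++         # literal "Ms." (dot escaped)
    }
    END {
      # one line per county: county|total|Ms. count|percent
      for (n in total)
        printf "%s|%d|%d|%.0f%%\n", n, total[n], ms[n], 100 * ms[n] / total[n]
    }' "$1" | sort
}
```

Something like `ms_share df 3 99` would list every county at once, which makes it easier to see which ones sit above or below the 20% limit.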