synthetichealth / synthea

Synthetic Patient Population Simulator
https://synthetichealth.github.io/synthea
Apache License 2.0

Problems in transforming the project into Pennsylvania #246

Closed XYGUAN closed 6 years ago

XYGUAN commented 6 years ago

This project is really fantastic and amazing! I am trying to extend the project beyond MA to PA, and perhaps in the future users could choose which states they want to focus on. However, along the way I have come across some problems:

I have updated the code and data files for PA, and the census data for PA is transformed successfully. However, when I try to generate the patient data, I find that three more data files still need to be updated for PA:

Currently all three of these files contain information only for MA, which will result in inaccurate patient data. I tried to work out how to generate these files this morning, but got stuck. If it is convenient, would you mind giving me some suggestions on how to get or create these three data files?

Thank you so much for the help!

Best, Xiuyang

jawalonoski commented 6 years ago

@XYGUAN Good timing! I was just about to email you, but you beat me to it...

It seems like you’ve made it beyond this point, but I’ve been addressing some of the issues with census data in this thread: #206

On to the files you asked about…

ghost commented 6 years ago

city_zip.json : if you know the county or FIPS code of your city you can get the ZIP from this dataset

https://wonder.cdc.gov/wonder/sci_data/codes/fips/type_txt/cntyxref.asp

XYGUAN commented 6 years ago

@nyquist212 Thank you very much for your help! I will try to download the ZIP from the link you sent to me.

@jawalonoski Thanks for sharing the "other_usa_states" branch. I tried it today and it works. Wow! (Over the weekend I tried another method and did manage to generate patient data, but I changed some things in the code, so something may be incorrect.)

I have only one more question: in the world folder, I see only the MA geo data. Will that affect the geographic information of the synthetic patients, or does it only matter for later data mapping and visualization?

Thank you so much for your help again! :)

jawalonoski commented 6 years ago

The other_usa_states branch will not generate lat/lon information for the Patient's address if the town does not exist within the lib/world/MA_geo.rb file.

So unless I make more changes, if you want to visualize the data on a map, you'll need to generate those lat/lon coordinates separately. You cannot plot the synthetic addresses, because the synthetic addresses do not exist in real street maps.

XYGUAN commented 6 years ago

@jawalonoski Thank you very much! If I add PA data to MA_geo.rb using the same data structure, will it work? Or do I need to change other code files as well?

jawalonoski commented 6 years ago

The current hack is to filter out states other than MA, but that is a one-line change to fix if you add the data.

XYGUAN commented 6 years ago

No problem. I will try to do it. Thanks a lot for your help! It is really an amazing project!

ghost commented 6 years ago

I think I've hit this wall too -> ./lib/world/MA_geo.rb

All my patient birthplaces and home addresses are in MA. Any guidance or recommendation on how to create these files for other States?

./lib/world/city_zip.json ./lib/world/MA_geo.rb

Also, what is the purpose of this file? I did not see any Facility field in the output data.

./lib/world/hospital.rb

XYGUAN commented 6 years ago

@nyquist212 The way I am trying to avoid these data files is:

Not sure whether it is OK or not. Hope it will be helpful.

ghost commented 6 years ago

@XYGUAN

Any idea why I'm getting this error when I run synthea:census?

Admin@Admins-MacBook:~/Public/synthea> bundle exec rake synthea:census --trace
** Invoke synthea:census (first_time)
** Execute synthea:census
rake aborted!
NoMethodError: undefined method `[]=' for nil:NilClass
lib/tasks/tasks.rake:337:in `block (3 levels) in <top (required)>'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/csv.rb:1771:in `each'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/csv.rb:1148:in `block in foreach'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/csv.rb:1299:in `open'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/csv.rb:1147:in `foreach'
lib/tasks/tasks.rake:329:in `block (2 levels) in <top (required)>'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:250:in `block in execute'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:250:in `each'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:250:in `execute'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:194:in `block in invoke_with_call_chain'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:187:in `invoke_with_call_chain'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:180:in `invoke'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:152:in `invoke_task'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:108:in `block (2 levels) in top_level'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:108:in `each'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:108:in `block in top_level'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:117:in `run_with_threads'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:102:in `top_level'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:80:in `block in run'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:178:in `standard_exception_handling'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:77:in `run'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/exe/rake:27:in `<top (required)>'
/Users/Admin/.rvm/rubies/ruby-2.4.2/bin/rake:23:in `load'
/Users/Admin/.rvm/rubies/ruby-2.4.2/bin/rake:23:in `<top (required)>'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `load'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `kernel_load'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:28:in `run'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/cli.rb:424:in `exec'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor.rb:387:in `dispatch'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/cli.rb:27:in `dispatch'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/base.rb:466:in `start'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/cli.rb:18:in `start'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/exe/bundle:30:in `block in <top (required)>'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/lib/bundler/friendly_errors.rb:122:in `with_friendly_errors'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/bundler-1.16.0/exe/bundle:22:in `<top (required)>'
/Users/Admin/.rvm/rubies/ruby-2.4.2/bin/bundle:23:in `load'
/Users/Admin/.rvm/rubies/ruby-2.4.2/bin/bundle:23:in `<main>'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/bin/ruby_executable_hooks:15:in `eval'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/bin/ruby_executable_hooks:15:in `<main>'
Tasks: TOP => synthea:census

SD Census Data Files.zip

XYGUAN commented 6 years ago

@nyquist212 That error comes from the code: because your state is not MA, Ruby reports that nothing was selected from the models.

You should use the "other_usa_states" branch (https://github.com/synthetichealth/synthea/tree/other_usa_states) to generate the census data. Before running the code, you should first replace the 4 data files in ./resources as described in thread #206.

Hope it is helpful.
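For readers unfamiliar with this Ruby error, here is a minimal sketch of one common way `undefined method []=' for nil:NilClass` arises: a hash lookup returns nil and the code then tries to assign into the result. The `towns` hash and the `record_income` helper are hypothetical stand-ins, not the actual code in lib/tasks/tasks.rake, and the cause of the trace above may differ.

```ruby
# Hypothetical stand-in for a lookup table keyed by town name.
towns = { 'Boston' => {} }

# Hypothetical helper: stores an income figure on an existing town entry.
def record_income(towns, town_name, income)
  entry = towns[town_name]   # nil when the town was never loaded
  entry[:income] = income    # raises NoMethodError if entry is nil
end

# Works: the key exists, so the assignment lands on a real hash.
record_income(towns, 'Boston', 55_000)

# Fails: the key is missing, so `[]=` is called on nil.
begin
  record_income(towns, 'Aurora County', 48_750)
rescue NoMethodError => e
  puts "Lookup failed: #{e.message}"
end
```

Using `towns.fetch(town_name)` instead of `towns[town_name]` would raise a KeyError that names the missing key, which is usually easier to diagnose.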

ghost commented 6 years ago

@XYGUAN

I'm running the "other_usa_states" tree with my census files (for SD) in resources/SD.

I'm editing lib/tasks/tasks.rake to point to these files but the synthea:census seems to abort with a very cryptic error. Any idea what's going on here?

Admin@Admins-MacBook:~/Public/synthea/synthea> bundle exec rake synthea:census --trace
** Invoke synthea:census (first_time)
** Execute synthea:census
rake aborted!
NoMethodError: undefined method `[]=' for nil:NilClass
lib/tasks/tasks.rake:330:in `block (3 levels) in <top (required)>'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/csv.rb:1771:in `each'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/csv.rb:1148:in `block in foreach'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/csv.rb:1299:in `open'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/csv.rb:1147:in `foreach'
lib/tasks/tasks.rake:322:in `block (2 levels) in <top (required)>'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:250:in `block in execute'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:250:in `each'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:250:in `execute'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:194:in `block in invoke_with_call_chain'
/Users/Admin/.rvm/rubies/ruby-2.4.2/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:187:in `invoke_with_call_chain'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/task.rb:180:in `invoke'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:152:in `invoke_task'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:108:in `block (2 levels) in top_level'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:108:in `each'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:108:in `block in top_level'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:117:in `run_with_threads'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:102:in `top_level'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:80:in `block in run'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:178:in `standard_exception_handling'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/lib/rake/application.rb:77:in `run'
/Users/Admin/.rvm/gems/ruby-2.4.2@global/gems/rake-12.0.0/exe/rake:27:in `<top (required)>'
/Users/Admin/.rvm/rubies/ruby-2.4.2/bin/rake:23:in `load'
/Users/Admin/.rvm/rubies/ruby-2.4.2/bin/rake:23:in `<top (required)>'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `load'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `kernel_load'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:28:in `run'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/cli.rb:424:in `exec'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor.rb:387:in `dispatch'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/cli.rb:27:in `dispatch'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/base.rb:466:in `start'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/cli.rb:18:in `start'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/exe/bundle:30:in `block in <top (required)>'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/lib/bundler/friendly_errors.rb:122:in `with_friendly_errors'
/Users/Admin/.rvm/gems/ruby-2.4.2/gems/bundler-1.16.0/exe/bundle:22:in `<top (required)>'
/Users/Admin/.rvm/gems/ruby-2.4.2/bin/bundle:23:in `load'
/Users/Admin/.rvm/gems/ruby-2.4.2/bin/bundle:23:in `<main>'
/Users/Admin/.rvm/gems/ruby-2.4.2/bin/ruby_executable_hooks:15:in `eval'
/Users/Admin/.rvm/gems/ruby-2.4.2/bin/ruby_executable_hooks:15:in `<main>'
Tasks: TOP => synthea:census
Admin@Admins-MacBook:~/Public/synthea/synthea>

jawalonoski commented 6 years ago

@nyquist212 The format of the income file may not be correct. Can you post the first 10 rows of the file?

The script is looking for a column in that file with the header GEO.display-label, which gets translated into the Ruby symbol :geodisplaylabel. It seems that the script does not see that column.

That is, the value it reads is nil.
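The header translation can be checked in isolation. The sketch below feeds a three-column sample (the first three columns from this thread, with the remaining columns omitted) through Ruby's standard CSV library using the same options that appear in tasks.rake; the sample data string itself is an assumption made for illustration.

```ruby
require 'csv'

# Two header rows, as in the census export discussed in this thread.
data = <<~CSV
  GEO.id,GEO.id2,GEO.display-label
  Id,Id2,Geography
  0500000US46003,46003,"Aurora County, South Dakota"
CSV

# Same options as the rake task: first row is the header, converted to symbols.
options = { headers: true, header_converters: :symbol }
rows = CSV.parse(data, **options)

# Punctuation is dropped by the :symbol converter.
puts rows.headers.inspect

rows.each do |row|
  next if row[:geoid] == 'Id' # skip the second, human-readable header row
  label = row[:geodisplaylabel]
  puts label.nil? ? 'column not found' : label
end
```

If the header row is missing or spelled differently, `row[:geodisplaylabel]` is nil and any method called on it raises NoMethodError.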

ghost commented 6 years ago

@jawalonoski

The column exists but I notice the town is missing. I only have County and State.

Presumably you're trying to substring this field into 3 parts and it's only finding two?

GEO.id | GEO.id2 | GEO.display-label | HC01_EST_VC01 | HC01_MOE_VC01 | HC02_EST_VC01 | HC02_MOE_VC01 | HC03_EST_VC01 | HC03_MOE_VC01 | HC04_EST_VC01 | HC04_MOE_VC01 | HC01_EST_VC02 | HC01_MOE_VC02 | HC02_EST_VC02 | HC02_MOE_VC02 | HC03_EST_VC02 | HC03_MOE_VC02 | HC04_EST_VC02 | HC04_MOE_VC02 | HC01_EST_VC03 | HC01_MOE_VC03 | HC02_EST_VC03 | HC02_MOE_VC03 | HC03_EST_VC03 | HC03_MOE_VC03 | HC04_EST_VC03 | HC04_MOE_VC03 | HC01_EST_VC04 | HC01_MOE_VC04 | HC02_EST_VC04 | HC02_MOE_VC04 | HC03_EST_VC04 | HC03_MOE_VC04 | HC04_EST_VC04 | HC04_MOE_VC04 | HC01_EST_VC05 | HC01_MOE_VC05 | HC02_EST_VC05 | HC02_MOE_VC05 | HC03_EST_VC05 | HC03_MOE_VC05 | HC04_EST_VC05 | HC04_MOE_VC05 | HC01_EST_VC06 | HC01_MOE_VC06 | HC02_EST_VC06 | HC02_MOE_VC06 | HC03_EST_VC06 | HC03_MOE_VC06 | HC04_EST_VC06 | HC04_MOE_VC06 | HC01_EST_VC07 | HC01_MOE_VC07 | HC02_EST_VC07 | HC02_MOE_VC07 | HC03_EST_VC07 | HC03_MOE_VC07 | HC04_EST_VC07 | HC04_MOE_VC07 | HC01_EST_VC08 | HC01_MOE_VC08 | HC02_EST_VC08 | HC02_MOE_VC08 | HC03_EST_VC08 | HC03_MOE_VC08 | HC04_EST_VC08 | HC04_MOE_VC08 | HC01_EST_VC09 | HC01_MOE_VC09 | HC02_EST_VC09 | HC02_MOE_VC09 | HC03_EST_VC09 | HC03_MOE_VC09 | HC04_EST_VC09 | HC04_MOE_VC09 | HC01_EST_VC10 | HC01_MOE_VC10 | HC02_EST_VC10 | HC02_MOE_VC10 | HC03_EST_VC10 | HC03_MOE_VC10 | HC04_EST_VC10 | HC04_MOE_VC10 | HC01_EST_VC11 | HC01_MOE_VC11 | HC02_EST_VC11 | HC02_MOE_VC11 | HC03_EST_VC11 | HC03_MOE_VC11 | HC04_EST_VC11 | HC04_MOE_VC11 | HC01_EST_VC13 | HC01_MOE_VC13 | HC02_EST_VC13 | HC02_MOE_VC13 | HC03_EST_VC13 | HC03_MOE_VC13 | HC04_EST_VC13 | HC04_MOE_VC13 | HC01_EST_VC15 | HC01_MOE_VC15 | HC02_EST_VC15 | HC02_MOE_VC15 | HC03_EST_VC15 | HC03_MOE_VC15 | HC04_EST_VC15 | HC04_MOE_VC15 | HC01_EST_VC18 | HC01_MOE_VC18 | HC02_EST_VC18 | HC02_MOE_VC18 | HC03_EST_VC18 | HC03_MOE_VC18 | HC04_EST_VC18 | HC04_MOE_VC18 | HC01_EST_VC19 | HC01_MOE_VC19 | HC02_EST_VC19 | HC02_MOE_VC19 | HC03_EST_VC19 | HC03_MOE_VC19 | HC04_EST_VC19 | HC04_MOE_VC19 | HC01_EST_VC20 | HC01_MOE_VC20 | HC02_EST_VC20 | HC02_MOE_VC20 | HC03_EST_VC20 | HC03_MOE_VC20 | HC04_EST_VC20 | HC04_MOE_VC20
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
Id | Id2 | Geography | Households; Estimate; Total | Households; Margin of Error; Total | Families; Estimate; Total | Families; Margin of Error; Total | Married-couple families; Estimate; Total | Married-couple families; Margin of Error; Total | Nonfamily households; Estimate; Total | Nonfamily households; Margin of Error; Total | Households; Estimate; Less than $10,000 | Households; Margin of Error; Less than $10,000 | Families; Estimate; Less than $10,000 | Families; Margin of Error; Less than $10,000 | Married-couple families; Estimate; Less than $10,000 | Married-couple families; Margin of Error; Less than $10,000 | Nonfamily households; Estimate; Less than $10,000 | Nonfamily households; Margin of Error; Less than $10,000 | Households; Estimate; $10,000 to $14,999 | Households; Margin of Error; $10,000 to $14,999 | Families; Estimate; $10,000 to $14,999 | Families; Margin of Error; $10,000 to $14,999 | Married-couple families; Estimate; $10,000 to $14,999 | Married-couple families; Margin of Error; $10,000 to $14,999 | Nonfamily households; Estimate; $10,000 to $14,999 | Nonfamily households; Margin of Error; $10,000 to $14,999 | Households; Estimate; $15,000 to $24,999 | Households; Margin of Error; $15,000 to $24,999 | Families; Estimate; $15,000 to $24,999 | Families; Margin of Error; $15,000 to $24,999 | Married-couple families; Estimate; $15,000 to $24,999 | Married-couple families; Margin of Error; $15,000 to $24,999 | Nonfamily households; Estimate; $15,000 to $24,999 | Nonfamily households; Margin of Error; $15,000 to $24,999 | Households; Estimate; $25,000 to $34,999 | Households; Margin of Error; $25,000 to $34,999 | Families; Estimate; $25,000 to $34,999 | Families; Margin of Error; $25,000 to $34,999 | Married-couple families; Estimate; $25,000 to $34,999 | Married-couple families; Margin of Error; $25,000 to $34,999 | Nonfamily households; Estimate; $25,000 to $34,999 | Nonfamily households; Margin of Error; $25,000 to $34,999 | Households; 
Estimate; $35,000 to $49,999 | Households; Margin of Error; $35,000 to $49,999 | Families; Estimate; $35,000 to $49,999 | Families; Margin of Error; $35,000 to $49,999 | Married-couple families; Estimate; $35,000 to $49,999 | Married-couple families; Margin of Error; $35,000 to $49,999 | Nonfamily households; Estimate; $35,000 to $49,999 | Nonfamily households; Margin of Error; $35,000 to $49,999 | Households; Estimate; $50,000 to $74,999 | Households; Margin of Error; $50,000 to $74,999 | Families; Estimate; $50,000 to $74,999 | Families; Margin of Error; $50,000 to $74,999 | Married-couple families; Estimate; $50,000 to $74,999 | Married-couple families; Margin of Error; $50,000 to $74,999 | Nonfamily households; Estimate; $50,000 to $74,999 | Nonfamily households; Margin of Error; $50,000 to $74,999 | Households; Estimate; $75,000 to $99,999 | Households; Margin of Error; $75,000 to $99,999 | Families; Estimate; $75,000 to $99,999 | Families; Margin of Error; $75,000 to $99,999 | Married-couple families; Estimate; $75,000 to $99,999 | Married-couple families; Margin of Error; $75,000 to $99,999 | Nonfamily households; Estimate; $75,000 to $99,999 | Nonfamily households; Margin of Error; $75,000 to $99,999 | Households; Estimate; $100,000 to $149,999 | Households; Margin of Error; $100,000 to $149,999 | Families; Estimate; $100,000 to $149,999 | Families; Margin of Error; $100,000 to $149,999 | Married-couple families; Estimate; $100,000 to $149,999 | Married-couple families; Margin of Error; $100,000 to $149,999 | Nonfamily households; Estimate; $100,000 to $149,999 | Nonfamily households; Margin of Error; $100,000 to $149,999 | Households; Estimate; $150,000 to $199,999 | Households; Margin of Error; $150,000 to $199,999 | Families; Estimate; $150,000 to $199,999 | Families; Margin of Error; $150,000 to $199,999 | Married-couple families; Estimate; $150,000 to $199,999 | Married-couple families; Margin of Error; $150,000 to $199,999 | Nonfamily households; 
Estimate; $150,000 to $199,999 | Nonfamily households; Margin of Error; $150,000 to $199,999 | Households; Estimate; $200,000 or more | Households; Margin of Error; $200,000 or more | Families; Estimate; $200,000 or more | Families; Margin of Error; $200,000 or more | Married-couple families; Estimate; $200,000 or more | Married-couple families; Margin of Error; $200,000 or more | Nonfamily households; Estimate; $200,000 or more | Nonfamily households; Margin of Error; $200,000 or more | Households; Estimate; Median income (dollars) | Households; Margin of Error; Median income (dollars) | Families; Estimate; Median income (dollars) | Families; Margin of Error; Median income (dollars) | Married-couple families; Estimate; Median income (dollars) | Married-couple families; Margin of Error; Median income (dollars) | Nonfamily households; Estimate; Median income (dollars) | Nonfamily households; Margin of Error; Median income (dollars) | Households; Estimate; Mean income (dollars) | Households; Margin of Error; Mean income (dollars) | Families; Estimate; Mean income (dollars) | Families; Margin of Error; Mean income (dollars) | Married-couple families; Estimate; Mean income (dollars) | Married-couple families; Margin of Error; Mean income (dollars) | Nonfamily households; Estimate; Mean income (dollars) | Nonfamily households; Margin of Error; Mean income (dollars) | Households; Estimate; PERCENT IMPUTED - Household income in the past 12   months | Households; Margin of Error; PERCENT IMPUTED - Household income in the   past 12 months | Families; Estimate; PERCENT IMPUTED - Household income in the past 12   months | Families; Margin of Error; PERCENT IMPUTED - Household income in the past   12 months | Married-couple families; Estimate; PERCENT IMPUTED - Household income in   the past 12 months | Married-couple families; Margin of Error; PERCENT IMPUTED - Household   income in the past 12 months | Nonfamily households; Estimate; PERCENT IMPUTED - Household income in the 
  past 12 months | Nonfamily households; Margin of Error; PERCENT IMPUTED - Household income   in the past 12 months | Households; Estimate; PERCENT IMPUTED - Family income in the past 12   months | Households; Margin of Error; PERCENT IMPUTED - Family income in the past   12 months | Families; Estimate; PERCENT IMPUTED - Family income in the past 12 months | Families; Margin of Error; PERCENT IMPUTED - Family income in the past 12   months | Married-couple families; Estimate; PERCENT IMPUTED - Family income in the   past 12 months | Married-couple families; Margin of Error; PERCENT IMPUTED - Family income   in the past 12 months | Nonfamily households; Estimate; PERCENT IMPUTED - Family income in the   past 12 months | Nonfamily households; Margin of Error; PERCENT IMPUTED - Family income in   the past 12 months | Households; Estimate; PERCENT IMPUTED - Nonfamily income in the past 12   months | Households; Margin of Error; PERCENT IMPUTED - Nonfamily income in the   past 12 months | Families; Estimate; PERCENT IMPUTED - Nonfamily income in the past 12   months | Families; Margin of Error; PERCENT IMPUTED - Nonfamily income in the past   12 months | Married-couple families; Estimate; PERCENT IMPUTED - Nonfamily income in   the past 12 months | Married-couple families; Margin of Error; PERCENT IMPUTED - Nonfamily   income in the past 12 months | Nonfamily households; Estimate; PERCENT IMPUTED - Nonfamily income in the   past 12 months | Nonfamily households; Margin of Error; PERCENT IMPUTED - Nonfamily income   in the past 12 months
0500000US46003 | 46003 | Aurora County, South Dakota | 1157 | 63 | 754 | 64 | 638 | 56 | 403 | 62 | 4.6 | 1.1 | 3.1 | 1.6 | 2 | 1.5 | 8.7 | 2.8 | 2.7 | 1.1 | 1.7 | 0.9 | 1.3 | 0.9 | 5.2 | 2.7 | 15 | 3.1 | 11.5 | 3.6 | 8.6 | 3.2 | 22.8 | 6.7 | 10.7 | 2.3 | 9.8 | 3.1 | 10.2 | 3.4 | 11.7 | 4 | 18.5 | 3.9 | 15.9 | 4.9 | 12.9 | 4.8 | 22.6 | 6.4 | 24.3 | 3.6 | 26.5 | 3.8 | 29.6 | 4.1 | 19.9 | 7.5 | 10.6 | 2.7 | 13.8 | 3.4 | 15.7 | 4 | 4 | 3.5 | 10.4 | 2.4 | 14.2 | 3 | 16.1 | 3.6 | 2.2 | 3 | 0.9 | 1 | 0.4 | 0.5 | 0.5 | 0.6 | 1.7 | 2.6 | 2.4 | 1.3 | 3.1 | 1.6 | 3.1 | 1.8 | 1.2 | 1.9 | 48750 | 3117 | 55395 | 3569 | (X) | (X) | 35855 | 6210 | 61548 | 4406 | 70605 | 5807 | N | N | 42407 | 7250 | 39 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 43.4 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 30.8 | (X)
0500000US46005 | 46005 | Beadle County, South Dakota | 7565 | 214 | 4738 | 275 | 3802 | 227 | 2827 | 283 | 5.4 | 1.9 | 2.3 | 2.1 | 0.2 | 0.2 | 10.7 | 3.8 | 7.5 | 1.9 | 5.2 | 2.4 | 2.7 | 1.8 | 11.6 | 3.4 | 13.4 | 2.8 | 11.4 | 3.3 | 8.3 | 3.6 | 18.2 | 5.2 | 13.1 | 2.6 | 8.6 | 3.3 | 9 | 3.9 | 20.6 | 4.7 | 14 | 2.6 | 11.7 | 2.7 | 10.8 | 2.4 | 17.3 | 4.9 | 20 | 3 | 23.3 | 4 | 24.1 | 4.2 | 14.6 | 4.3 | 13.4 | 2.3 | 17.9 | 3.4 | 21.8 | 4.1 | 4.7 | 2 | 9.4 | 2 | 14 | 3 | 16.4 | 3.6 | 1.3 | 1.4 | 2.1 | 0.7 | 2.8 | 1 | 3.4 | 1.3 | 0.9 | 0.7 | 1.8 | 1 | 2.9 | 1.6 | 3.3 | 1.9 | 0.1 | 0.2 | 46267 | 2209 | 62024 | 5123 | 71071 | 3204 | 30409 | 3161 | 58679 | 3235 | 71827 | 4613 | N | N | 35249 | 2534 | 31.5 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 32.8 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 29.4 | (X)
0500000US46007 | 46007 | Bennett County, South Dakota | 1050 | 75 | 762 | 70 | 439 | 61 | 288 | 51 | 8.6 | 2.7 | 7.1 | 2.8 | 4.1 | 3.1 | 13.5 | 5.9 | 6.2 | 2.6 | 5 | 2.7 | 3.2 | 2.8 | 10.8 | 7.1 | 14.1 | 4.4 | 11.8 | 5 | 3.9 | 2.2 | 20.5 | 6.8 | 13.8 | 3.7 | 12.5 | 4.3 | 9.1 | 4.8 | 19.4 | 7.7 | 16.9 | 3.7 | 22.6 | 5.6 | 22.1 | 7.4 | 9.4 | 4.5 | 21.7 | 4.3 | 20.6 | 4.9 | 23.2 | 6.4 | 18.1 | 8.4 | 7.1 | 2.5 | 6.7 | 2.8 | 11.4 | 5 | 6.6 | 4.4 | 8.7 | 3 | 10.4 | 3.6 | 17.1 | 5.8 | 0 | 6.7 | 1.4 | 1.1 | 1.3 | 1.3 | 2.3 | 2.4 | 1.7 | 2.4 | 1.5 | 1.9 | 2.1 | 2.6 | 3.6 | 4.5 | 0 | 6.7 | 42171 | 4116 | 46429 | 3285 | 57578 | 5596 | 26563 | 3387 | 54535 | 8832 | 59798 | 11593 | N | N | 34790 | 5105 | 41.6 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 47 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 25 | (X)
0500000US46009 | 46009 | Bon Homme County, South Dakota | 2480 | 112 | 1603 | 103 | 1365 | 97 | 877 | 100 | 6.1 | 1.9 | 2.2 | 1.4 | 1.5 | 1.2 | 14.6 | 5.3 | 7.9 | 2.2 | 1.7 | 1 | 1.5 | 1 | 19.3 | 5.7 | 11.2 | 2.6 | 8.5 | 2.3 | 8.9 | 2.8 | 15.7 | 5.2 | 14.3 | 2.9 | 15.8 | 3.8 | 11.9 | 3.3 | 14.3 | 4.5 | 14.9 | 2.4 | 11.4 | 2.6 | 10.5 | 2.7 | 19.3 | 4.7 | 18.9 | 2.7 | 23.6 | 3.7 | 25.2 | 4.2 | 10.1 | 3.3 | 11.3 | 1.9 | 15.8 | 3 | 16.8 | 3.3 | 2.5 | 2.2 | 11.3 | 1.8 | 14.8 | 3.1 | 17.2 | 3.6 | 3.4 | 2.7 | 1.8 | 0.9 | 2.8 | 1.4 | 2.9 | 1.4 | 0 | 2.3 | 2.3 | 1 | 3.1 | 1.4 | 3.6 | 1.7 | 0.8 | 1.2 | 45254 | 3268 | 60947 | 4078 | 64821 | 3163 | 25417 | 7119 | 61066 | 4445 | 75213 | 7155 | N | N | 33336 | 4748 | 34 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 39.5 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 23.6 | (X)
0500000US46011 | 46011 | Brookings County, South Dakota | 12325 | 284 | 7174 | 320 | 5601 | 306 | 5151 | 299 | 5.2 | 1.3 | 1.8 | 0.8 | 1 | 0.7 | 10.6 | 2.9 | 6.5 | 1.5 | 1.8 | 1.2 | 1.1 | 0.9 | 13.7 | 3.5 | 12.7 | 2.1 | 7.9 | 2.4 | 2.2 | 1 | 21.6 | 4.3 | 10.5 | 1.8 | 8.9 | 2 | 5.5 | 1.7 | 12.7 | 3 | 15 | 2 | 13.8 | 2.9 | 12.6 | 3.1 | 18.1 | 3.5 | 20.1 | 2.3 | 21.8 | 2.5 | 24.2 | 2.9 | 15.1 | 3.3 | 12.1 | 1.5 | 15.4 | 2.3 | 17.8 | 2.7 | 5.5 | 1.8 | 12.1 | 1.3 | 19.3 | 2.3 | 24.1 | 2.9 | 1.8 | 0.9 | 2.9 | 1 | 4.8 | 1.7 | 6.1 | 2.1 | 0 | 0.4 | 2.9 | 0.9 | 4.3 | 1.5 | 5.4 | 2 | 0.9 | 1.1 | 50082 | 2006 | 65591 | 5586 | 78776 | 5139 | 27309 | 4703 | 65907 | 4241 | 85440 | 7513 | N | N | 36253 | 2628 | 24 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 21.4 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 26.4 | (X)
0500000US46013 | 46013 | Brown County, South Dakota | 15996 | 292 | 9966 | 399 | 7908 | 353 | 6030 | 421 | 5.3 | 1.3 | 1.8 | 1.1 | 0.6 | 0.5 | 11.9 | 3 | 4.8 | 0.9 | 2 | 0.9 | 1.2 | 0.6 | 9.3 | 2.1 | 11.3 | 1.6 | 7.5 | 1.6 | 2.9 | 1.2 | 20.4 | 3.2 | 10.2 | 1.5 | 6.8 | 1.3 | 5.4 | 1.4 | 16 | 3.1 | 14.3 | 1.9 | 13.3 | 2.2 | 11.7 | 2 | 18.1 | 3.4 | 20.9 | 2.2 | 23.4 | 2.9 | 24 | 2.8 | 14.4 | 2.7 | 15.2 | 1.5 | 20.2 | 2.2 | 24.2 | 2.5 | 5 | 1.5 | 12.5 | 1.6 | 17.7 | 2.6 | 21.2 | 3 | 3.4 | 1.5 | 1.9 | 0.5 | 2.6 | 0.8 | 3.1 | 1.1 | 0.5 | 0.4 | 3.7 | 1 | 4.7 | 1.3 | 5.7 | 1.6 | 1.1 | 0.8 | 53100 | 2170 | 68685 | 3620 | 78413 | 2140 | 30617 | 1477 | 67609 | 2944 | 81945 | 4213 | N | N | 39027 | 2975 | 29.4 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 29.5 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 27.5 | (X)
0500000US46015 | 46015 | Brule County, South Dakota | 2067 | 81 | 1485 | 86 | 1169 | 79 | 582 | 77 | 6.2 | 2.5 | 4.8 | 2.9 | 2.1 | 2.7 | 11.3 | 4.4 | 5.8 | 2.7 | 2.9 | 2.7 | 3.3 | 3.4 | 13.1 | 6 | 9 | 2.2 | 7.4 | 2.7 | 7.3 | 3.2 | 12.9 | 4.1 | 13.2 | 2.8 | 11.6 | 3.5 | 8.1 | 3.3 | 19.9 | 6.1 | 16.1 | 3.4 | 16.4 | 4.1 | 13.7 | 4 | 16.5 | 5.9 | 26.7 | 3.3 | 28.9 | 4.7 | 31.9 | 5.3 | 16 | 6.8 | 11.3 | 2.4 | 13.7 | 3.2 | 16 | 3.8 | 4.8 | 3.2 | 7.2 | 2.2 | 9 | 3 | 10.9 | 3.8 | 2.7 | 2.4 | 2.5 | 1.3 | 3.1 | 1.8 | 3.9 | 2.3 | 1 | 1.6 | 2.1 | 1.2 | 2.2 | 1.5 | 2.7 | 1.8 | 1.7 | 2 | 49531 | 5291 | 56646 | 3348 | 60457 | 4497 | 30921 | 4316 | 59339 | 5392 | 65104 | 7577 | N | N | 41175 | 6457 | 34.5 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 36.6 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 27.7 | (X)
0500000US46017 | 46017 | Buffalo County, South Dakota | 546 | 49 | 435 | 38 | 216 | 31 | 111 | 33 | 14.3 | 4.2 | 15.2 | 5.2 | 8.3 | 5.5 | 17.1 | 10.8 | 12.5 | 3.9 | 10.8 | 4.5 | 10.2 | 6.4 | 21.6 | 11.5 | 13.6 | 4.3 | 14.5 | 4.7 | 10.6 | 5.7 | 7.2 | 7.1 | 15.4 | 4.5 | 14.3 | 4.8 | 11.6 | 6.8 | 22.5 | 11.7 | 12.1 | 4.2 | 9.7 | 3.8 | 12 | 5.5 | 18 | 11.4 | 20.5 | 5 | 25.1 | 6.4 | 30.6 | 9.7 | 6.3 | 6.7 | 5.9 | 2.9 | 5.7 | 3 | 9.7 | 4.6 | 0 | 16.4 | 4.9 | 3 | 3.7 | 2.7 | 4.6 | 3.4 | 7.2 | 8.7 | 0.9 | 1 | 1.1 | 1.2 | 2.3 | 2.5 | 0 | 16.4 | 0 | 3.6 | 0 | 4.5 | 0 | 8.8 | 0 | 16.4 | 31163 | 1883 | 31458 | 3277 | (X) | (X) | 27679 | 13312 | 40141 | 4096 | 40295 | 4433 | N | N | 31073 | 9005 | 36.1 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 34 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 38.7 | (X)
0500000US46019 | 46019 | Butte County, South Dakota | 4079 | 158 | 2548 | 231 | 2127 | 229 | 1531 | 221 | 5.7 | 2.7 | 1.4 | 1.4 | 0.2 | 0.3 | 13.3 | 6.2 | 11.4 | 3.3 | 4.5 | 2.9 | 3.1 | 2.3 | 23 | 6.7 | 13 | 4 | 9.8 | 5.3 | 8.7 | 5.8 | 18.7 | 6.5 | 12.2 | 3.2 | 12.6 | 3.7 | 9.8 | 3.2 | 11.8 | 5.6 | 12.1 | 3.1 | 11.7 | 3.9 | 11.2 | 4 | 12.6 | 5.6 | 20.7 | 4.5 | 25.6 | 6 | 29 | 6.6 | 12.1 | 6 | 13.2 | 3.4 | 17.1 | 4.8 | 20.5 | 5.6 | 6.5 | 5.3 | 6.9 | 2.4 | 9.7 | 3.2 | 9.9 | 3.5 | 2 | 2 | 1.5 | 1 | 2.5 | 1.6 | 3 | 1.9 | 0 | 1.3 | 3.2 | 1.9 | 5.1 | 3.1 | 4.6 | 3.2 | 0 | 1.3 | 41920 | 7479 | 58026 | 5407 | (X) | (X) | 21426 | 5007 | 56014 | 5813 | 70831 | 8318 | N | N | 30257 | 4621 | 28.2 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 27.4 | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | (X) | 29.1 | (X)

BTW, you've done fantastic work on this project. We're all very impressed over here on my team.

ACS_15_5YR_S1901_with_ann.csv.txt

jawalonoski commented 6 years ago

You will need the data by town, not just by county. It is expecting something like this: Hartford township, Adams County, Indiana

It isn't expecting 3 items, though; it is just looking at the first, so I'm not sure what is going on.
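To make the difference between the two label shapes concrete, here is a small sketch. It assumes the label is split on commas and only the first piece is kept; the helper name `first_label_part` is hypothetical, and the real parsing in tasks.rake may differ.

```ruby
# Assumption: only the text before the first comma is used as the place name.
def first_label_part(label)
  label.split(',').first.strip
end

town_level   = 'Hartford township, Adams County, Indiana'
county_level = 'Aurora County, South Dakota'

puts first_label_part(town_level)   # Hartford township
puts first_label_part(county_level) # Aurora County
```

With a county-level export the first piece is a county name, so a lookup keyed by town name would not find a match.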

jawalonoski commented 6 years ago

@nyquist212 Can you post all four files in a zip? I can try to debug.

ghost commented 6 years ago

I pulled data for 5 states and they all appear to be missing the town from the GEO.display-label field.

I'm wondering if this is a user error on my part when selecting filters and extracting data from census.gov.

IA.zip MN.zip ND.zip NE.zip SD.zip

jawalonoski commented 6 years ago

@nyquist212 So, I just tried the SD.zip file and got it to work without errors.

Changes in ./lib/tasks/tasks.rake required:

--- a/lib/tasks/tasks.rake
+++ b/lib/tasks/tasks.rake
@@ -271,7 +271,7 @@ namespace :synthea do
     options = { :headers => true, :header_converters => :symbol }
     towns = {}
     counties = {}
-    townfile = File.open('./resources/SUB-EST2015_18.csv', 'r:UTF-8')
+    townfile = File.open('./resources/SD/SUB-EST2016_46_SD.csv', 'r:UTF-8')
     CSV.foreach(townfile, options) do |row|
       if row[:primgeo_flag].to_i == 1 && row[:funcstat] == 'A'
         town_name = row[:name].split.keep_if { |x| !%w(town township city CDP (balance) (pt.)).include?(x.downcase) }.join(' ')
@@ -290,7 +290,7 @@ namespace :synthea do
     end
     townfile.close
     ageGroups = ['Total', (0..4), (5..9), (10..14), (15..19), (20..24), (25..29), (30..34), (35..39), (40..44), (45..49), (50..54), (55..59), (60..64), (65..69), (70..74), (75..79), (80..84), (85..110)]
-    countyfile = File.open('./resources/CC-EST2015-ALLDATA-18.csv', 'r:UTF-8')
+    countyfile = File.open('./resources/SD/CC-EST2016-ALLDATA-46_SD.csv', 'r:UTF-8')
     CSV.foreach(countyfile, options) do |row|
       # if (2015 estimate) && (total overall demographics)
       if row[:year].to_i == 8 && row[:agegrp].to_i.zero?
@@ -322,7 +322,7 @@ namespace :synthea do
     end
     countyfile.close

-    incomefile = File.open('./resources/ACS_14_5YR_S1901_ann.csv', 'r:UTF-8')
+    incomefile = File.open('./resources/SD/ACS_15_5YR_S1901_with_ann.csv', 'r:UTF-8')
     CSV.foreach(incomefile, options) do |row|
       next if row[:geoid] == 'Id' # this CSV has 2 header rows
       next if row[:geodisplaylabel].include?('not defined')
@@ -347,7 +347,7 @@ namespace :synthea do
     end
     incomefile.close

-    educationfile = File.open('./resources/ACS_14_5YR_S1501_ann.csv', 'r:UTF-8')
+    educationfile = File.open('./resources/SD/ACS_15_5YR_S1501_with_ann.csv', 'r:UTF-8')
     CSV.foreach(educationfile, options) do |row|
       next if row[:geoid] == 'Id' # this CSV has 2 header rows
       next if row[:geodisplaylabel].include?('not defined')

Issues:

I may end up generating all of the configuration files for every state, since folks seem to have trouble with this.
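Until per-state configuration exists, the four paths edited in the diff above could be kept in one lookup rather than changed by hand in four places. This is only a sketch: the `CENSUS_FILES` constant, its keys, and the `census_path` helper are hypothetical and not part of Synthea; the file names are the SD ones from the diff.

```ruby
# Hypothetical per-state table of the four input files used by synthea:census.
CENSUS_FILES = {
  'SD' => {
    towns:     'SUB-EST2016_46_SD.csv',
    counties:  'CC-EST2016-ALLDATA-46_SD.csv',
    income:    'ACS_15_5YR_S1901_with_ann.csv',
    education: 'ACS_15_5YR_S1501_with_ann.csv'
  }
}.freeze

# Builds ./resources/<STATE>/<file>, failing loudly for unconfigured states.
def census_path(state, kind)
  files = CENSUS_FILES.fetch(state) do
    raise ArgumentError, "no census files configured for #{state}"
  end
  File.join('.', 'resources', state, files.fetch(kind))
end

puts census_path('SD', :income)
```

Each `File.open` call in the task could then take `census_path(state, :towns)` and so on, so that adding a state means adding one hash entry.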

ghost commented 6 years ago

That would be enormously helpful. For posterity, this is the source for ALL of the US State Population data.

"subcounty population estimates for towns and cities" https://www2.census.gov/programs-surveys/popest/datasets/2010-2015/cities/totals/

"county population estimates by age, gender, race, ethnicity" https://www2.census.gov/programs-surveys/popest/datasets/2010-2015/counties/asrh/

I suspect the education and income data could come from the API https://api.census.gov/data/2015/acs/acs5.html https://api.census.gov/data/2015/acs/acs5/variables.html
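As a rough illustration of what an API request might look like, the sketch below only builds a query URL and makes no network call. Several things here are assumptions: the `acs5_url` helper is hypothetical, B19013_001E (median household income) is used as an example variable, the geography is county-level, and 46 is the state FIPS code for South Dakota as seen in the GEO.id2 values earlier in this thread. Town-level data would need a different `for` clause, and the S1901/S1501 subject tables may live under a different endpoint.

```ruby
require 'uri'

# Hypothetical helper that assembles an ACS 5-year query URL.
def acs5_url(year:, variables:, state_fips:)
  query = URI.encode_www_form(
    'get' => (['NAME'] + variables).join(','),
    'for' => 'county:*',              # assumption: county-level geography
    'in'  => "state:#{state_fips}"
  )
  "https://api.census.gov/data/#{year}/acs/acs5?#{query}"
end

url = acs5_url(year: 2015, variables: ['B19013_001E'], state_fips: '46')
puts url
```

The variables page linked above lists the available variable names for that dataset.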

jawalonoski commented 6 years ago

This should be fixed with PR #245