UDST / synthpop

Synthetic populations from census data
BSD 3-Clause "New" or "Revised" License
99 stars 47 forks source link

specify the columns to get from pums #35

Closed Eh2406 closed 7 years ago

Eh2406 commented 7 years ago

This massively reduced the memory use of our recipe.

looks good to go, thoughts?

Eh2406 commented 7 years ago

test fixed

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.1%) to 81.342% when pulling 04e3de2c710636e9321eebe2436148f76d995ad9 on SEMCOG:usecols into 554788a4a78f1cf1f3601407206aea0e4456ceb5 on UDST:master.

Eh2406 commented 7 years ago

@janowicz thoughts? Do you want me to squash the commits? Do you know why testing starter2 works locally but not on travis?

janowicz commented 7 years ago

Thanks @Eh2406! Happy to hear about the reduced memory use. I'll get a chance to review this and look into those questions later in the week

Eh2406 commented 7 years ago

Did you have a chance to look at this?

janowicz commented 7 years ago

This is a nice change :+1: . Thanks @Eh2406. I observed lower memory usage here too.
One thing users should note is that if they're relying on a particular PUMS variable in the output that is not in h_pums_cols / p_pums_cols, then they should add that variable to the *_pums_cols tuple (or join it to the output).

The starter2 recipe runs locally for me too. I think backing out the starter2 test is fine- it can take awhile to run and is just an alternative recipe example. The main starter is more appropriate for a test.

Eh2406 commented 7 years ago

Do you want me to squash the commits? How can we make the use of *_pums_cols clearer? What else should be done before merge?

janowicz commented 7 years ago

No need to squash. I added a comment about *_pums_cols. I think its ready to merge, thanks!

janowicz commented 7 years ago

The failed travis tests are related to https://github.com/UDST/synthpop/issues/36 which will be resolved separately.