CoryMcCartan / blockpop

Estimate Census Block Populations for 2020
https://corymccartan.github.io/blockpop
GNU General Public License v3.0

Harmonizing erroring out to duplicates, after second try #1

Closed: kuriwaki closed this issue 3 years ago

kuriwaki commented 3 years ago

I am following your main example with state = "SC". The individual components look good at first glance, but I am getting an error at harmonize_vars; see below. I don't know why, but the error message also changes the second time I run the same command: the internal table name increments from `_DT3` to `_DT4`, which I think is related to the duplication error.

> block_d
# A tibble: 181,908 x 4
   state block           pop2010 pop2020
   <fct> <chr>             <dbl>   <dbl>
 1 SC    450019501001000       0    0   
 2 SC    450019501001001      15   14.7 
 3 SC    450019501001002       2    1.97
 4 SC    450019501001003       2    1.97
 5 SC    450019501001004      20   19.7 
 6 SC    450019501001005       2    3.08
 7 SC    450019501001006       6    5.90
 8 SC    450019501001007       0    0   
 9 SC    450019501001008       0    0   
10 SC    450019501001009       0    0   
# … with 181,898 more rows
> census_d
# A tibble: 181,908 x 19
   block      pop pop_hisp pop_white pop_black pop_aian pop_asian pop_nhpi pop_other pop_two   vap vap_hisp
   <chr>    <dbl>    <dbl>     <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl>   <dbl> <dbl>    <dbl>
 1 4500195…     0        0         0         0        0         0        0         0       0     0        0
 2 4500195…     0        0         0         0        0         0        0         0       0     0        0
 3 4500195…    15        0        15         0        0         0        0         0       0     9        0
 4 4500195…    20        0        20         0        0         0        0         0       0    14        0
 5 4500195…     0        0         0         0        0         0        0         0       0     0        0
 6 4500195…     2        0         2         0        0         0        0         0       0     2        0
 7 4500195…     2        0         2         0        0         0        0         0       0     2        0
 8 4500195…     0        0         0         0        0         0        0         0       0     0        0
 9 4500195…     2        0         2         0        0         0        0         0       0     2        0
10 4500195…     0        0         0         0        0         0        0         0       0     0        0
# … with 181,898 more rows, and 7 more variables: vap_white <dbl>, vap_black <dbl>, vap_aian <dbl>,
#   vap_asian <dbl>, vap_nhpi <dbl>, vap_other <dbl>, vap_two <dbl>
> acs_d
# A tibble: 3,059 x 10
   bgroup         pop pop_hisp pop_white pop_black pop_aian pop_asian pop_nhpi pop_other pop_two
   <chr>        <dbl>    <dbl>     <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl>   <dbl>
 1 450870307002   732        0       505       227        0         0        0         0       0
 2 450870307003  1325        0       963       306        0         0        0         0      56
 3 450870307001  1215       10       974       224        0         0        0         7       0
 4 450870301001   840       61       501       251        0         0        0         0      27
 5 450870301002  1234        0       590       558        7         0        0         0      79
 6 450870308002   710        0       468       223        0         0        0         0      19
 7 450870308004   668        0       338       330        0         0        0         0       0
 8 450870308003   986        0       731       184        0         0        0         0      71
 9 450870306001   894        0       407       479        0         4        0         0       4
10 450870304004   483        0       425        58        0         0        0         0       0
# … with 3,049 more rows
> harmonize_vars(block_d, census_d, acs_d)
ℹ Joining tables.
ℹ Harmonizing counts.
Error in `[.data.table`(copy(`_DT3`)[, `:=`(tract = str_sub(block, 1,  : 
  Can't assign to the same column twice in the same query (duplicates detected).
> harmonize_vars(block_d, census_d, acs_d)
ℹ Joining tables.
ℹ Harmonizing counts.
Error in `[.data.table`(copy(`_DT4`)[, `:=`(tract = str_sub(block, 1,  : 
  Can't assign to the same column twice in the same query (duplicates detected).

Also, btw, the first time I ran this it said it could not find the function fcoalesce, so I had to load the data.table package. Maybe an implicit dependency?

CoryMcCartan commented 3 years ago

Thanks for catching! The fcoalesce issue seems to be with dtplyr but can be fixed with an import.
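For anyone hitting the same fcoalesce message before updating, here is a minimal sketch of the workaround described above (attaching data.table so that fcoalesce is on the search path). The vectors are made-up illustration data, and the roxygen2 tag in the comment is only an assumption about how such an import might be declared, not the package's actual change.

```r
# Workaround sketch: attach data.table so fcoalesce() can be found.
# Assumption: data.table is installed. In a package, an import could be
# declared with a roxygen2 tag such as:
#   #' @importFrom data.table fcoalesce
library(data.table)

# Made-up illustration data (not from blockpop).
x <- c(NA, 2, NA)
y <- c(1, 5, NA)

# fcoalesce() takes the first non-missing value at each position,
# falling back to 0 where both x and y are NA.
filled <- fcoalesce(x, y, 0)
print(filled)
```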

And the duplication error should be fixed in the most recent commit.
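For reference, the data.table message reported above can be reproduced in isolation. The sketch below uses a hypothetical two-row table and a hypothetical column name `flag`; it only illustrates the general condition (the same column named twice in one `:=` call) and is not the code inside harmonize_vars.

```r
# Minimal reproduction sketch of the "assign to the same column twice" error.
# The table and the column name `flag` are hypothetical illustration values.
library(data.table)

dt <- data.table(block = c("a", "b"))

# Naming the same target column twice in a single `:=` call is rejected;
# capture the error message instead of stopping the script.
res <- tryCatch(
  dt[, `:=`(flag = 1L, flag = 2L)],
  error = function(e) conditionMessage(e)
)
print(res)

# Assigning each column once works as usual.
dt[, `:=`(flag = 1L, other = 2L)]
print(names(dt))
```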