NYCPlanning / db-factfinder

data ETL for population fact finder (decennial + acs)
https://nycplanning.github.io/db-factfinder/factfinder/
MIT License
2 stars 3 forks source link

aggregated_geography E, M match but P Z mismatch #73

Closed SPTKL closed 3 years ago

SPTKL commented 3 years ago

Update: mostly resolved, still remaining variables: hh5, pop_6

{
        "pff_variable": "pop_6",
        "base_variable": "percapinc",
        "census_variable": [
            "B01001_001"
        ],
        "domain": "economic",
        "rounding": 0,
        "source": ""
    },
{
        "pff_variable": "hh5",
        "base_variable": "mnhhinc",
        "census_variable": [
            "DP03_0051"
        ],
        "domain": "economic",
        "rounding": 0,
        "source": "profile"
    },
select
    *
from (
select 
    a.census_geoid,
    a.geoid,
    a.pff_variable,
    a.c, a.e, a.m, a.p, a.z,
    b.c as _c, b.e as _e, b.m as _m, b.p as _p, b.z as _z
from (
    select *, a.census_geoid as geoid
    from pff_acs."2018-test" a
    where a.geotype = 'NTA'
) a
join pff_acs."2018" b
on a.pff_variable = lower(b.variable)
and a.geoid = b.geoid
and b.geotype = 'NTA2010'
) a
where abs(e - _e) > 1 or abs(m - _m) > 1 or abs(p - _p) > 0.1 or abs(z - _z) > 0.1
SPTKL commented 3 years ago

sometimes the base is a median variable

SPTKL commented 3 years ago

Update: mostly resolved, still remaining variables: hh5, pop_6

mgraber commented 3 years ago

To fully resolve, do not calculate P and Z for not-in-profile variables. See #77 and #41.

mgraber commented 3 years ago

Will get resolved by #77