CartoDB / observatory-extension

BSD 3-Clause "New" or "Revised" License
6 stars 4 forks source link

GetAvailableGeometries should be filtered by timespan #297

Closed ethervoid closed 7 years ago

ethervoid commented 7 years ago

We made a "fix" to make able to have timespan in the UI.

The problem with this solution is that we're showing geometries that don't have results for the selected numerator and timespan so we need to be able to filter by timespan.

For example if we use the Total population numerator, the ZIP codes geometry and the '2015 - 2015' timespan, we're going to get all NULLs because we don't have data for that timespan/geometry:

select numer_id, numer_timespan from obs_meta where numer_id = 'us.census.acs.B01003001' and geom_id = 'us.census.tiger.zcta5_clipped';
        numer_id         | numer_timespan
-------------------------+----------------
 us.census.acs.B01003001 | 2011 - 2015
 us.census.acs.B01003001 | 2010 - 2014
(2 rows)

My proposal is to change the OBS_GetAvailableGeometries function and make it "smart" enough to be able to filter using the geometries timespan and the numerators timespan. Why? Because for example the OBS_GetAvailableTimspans returns the numerators timespans and if we, for example in our UI, want to filter geometries with data for certain timespan we need to filter for the two of them (geometries / numerators)

ethervoid commented 7 years ago

I managed to make a "smart" query to filter by numerator and/or geometry timespan.

This query takes into account if we're passing a numerator_id and filter the geometries using either the numerator or the geometry timespan.

But I reallized that we have a mismatch between obs_meta and obs_meta_timespan. In obs_meta which is used by the OBS_GetMeta function we get 9 rows:

dataservices_db=# select numer_id, numer_timespan, geom_id from observatory.obs_meta where numer_id = 'us.census.acs.B01003001' and numer_timespan = '2015 - 2015' and geom_id ilike '%_clipped';
        numer_id         | numer_timespan |                      geom_id
-------------------------+----------------+----------------------------------------------------
 us.census.acs.B01003001 | 2015 - 2015    | us.census.tiger.school_district_elementary_clipped
 us.census.acs.B01003001 | 2015 - 2015    | us.census.tiger.puma_clipped
 us.census.acs.B01003001 | 2015 - 2015    | us.census.tiger.state_clipped
 us.census.acs.B01003001 | 2015 - 2015    | us.census.tiger.place_clipped
 us.census.acs.B01003001 | 2015 - 2015    | us.census.tiger.school_district_secondary_clipped
 us.census.acs.B01003001 | 2015 - 2015    | us.census.tiger.cbsa_clipped
 us.census.acs.B01003001 | 2015 - 2015    | us.census.tiger.congressional_district_clipped
 us.census.acs.B01003001 | 2015 - 2015    | us.census.tiger.county_clipped
 us.census.acs.B01003001 | 2015 - 2015    | us.census.tiger.school_district_unified_clipped
(9 rows)

but if we check for the same data using the obs_meta_* tables, we get 12 rows:

dataservices_db=# SELECT geom_id, o.timespans timespan_geom, ot.timespan_id timespan_timespan FROM observatory.obs_meta_geom o INNER JOIN obs_meta_timespan ot ON timespan_id = '2015 - 2015' WHERE '2015 - 2015' = ANY(o.timespans) OR '2015 - 2015' = ot.timespan_id AND geom_id ilike 'us.%_clipped';
                      geom_id                       | timespan_geom | timespan_timespan
----------------------------------------------------+---------------+-------------------
 us.census.tiger.state_clipped                      | {2014}        | 2015 - 2015
 us.census.tiger.place_clipped                      | {2014}        | 2015 - 2015
 us.census.tiger.school_district_secondary_clipped  | {2014}        | 2015 - 2015
 us.census.tiger.census_tract_clipped               | {2014}        | 2015 - 2015
 us.census.tiger.county_clipped                     | {2014}        | 2015 - 2015
 us.census.tiger.zcta5_clipped                      | {2014}        | 2015 - 2015
 us.census.tiger.block_group_clipped                | {2014}        | 2015 - 2015
 us.census.tiger.cbsa_clipped                       | {2014}        | 2015 - 2015
 us.census.tiger.school_district_elementary_clipped | {2014}        | 2015 - 2015
 us.census.tiger.puma_clipped                       | {2014}        | 2015 - 2015
 us.census.tiger.congressional_district_clipped     | {2014}        | 2015 - 2015
 us.census.tiger.school_district_unified_clipped    | {2014}        | 2015 - 2015
(12 rows)

So presumibably at some points we're doing something wrong

ethervoid commented 7 years ago

After looking for the best solution for this problem. I've ended up modifying the meta data and creating a new table that links the geometries with the numerators and their timespans. You can see it here

With this new table, we're able to filter the geometries using either the geometries timespan or the numerators timespans.

ethervoid commented 7 years ago

PR #302

ethervoid commented 7 years ago

Blocked until we deploy all the related stuff with the new DO