CartoDB / observatory-extension

BSD 3-Clause "New" or "Revised" License
6 stars 4 forks source link

OBSGetmeta parameters are not working correctly #328

Open ethervoid opened 6 years ago

ethervoid commented 6 years ago

@michellemho ran into unexpected results while exploring metadata using CARTOframes the CARTOframes query was: cc.data_discovery('United States', keywords='Median Income', time='2010 - 2014').T under the hood, the SQL query is:

WITH envelope AS (
SELECT ST_MakeEnvelope(-180.0, -85.0, 180.0, 85.0, 4326) AS env, 500::int AS cnt
), numers AS (
SELECT
numer_id,
null AS geom_id,
'2010 - 2014' AS numer_timespan,
'predenominated' AS normalization
FROM
OBS_GetAvailableNumerators(
(SELECT env FROM envelope),
'{section/tags.united_states}',
null,
null,
'2010 - 2014')
WHERE ((numer_description ILIKE '%Median Income%' OR numer_name ILIKE '%Median Income%')) AND (valid_timespan)
UNION
SELECT
numer_id,
null AS geom_id,
'2010 - 2014' AS numer_timespan,
null AS normalization
FROM
OBS_GetAvailableNumerators(
(SELECT env FROM envelope),
'{section/tags.united_states}',
null,
null,
'2010 - 2014')
WHERE ((numer_description ILIKE '%Median Income%' OR numer_name ILIKE '%Median Income%')) AND (valid_timespan)
)
SELECT *
FROM json_to_recordset(
(SELECT OBS_GetMeta(
envelope.env,
json_agg(numers),
10, 10, envelope.cnt
) AS meta
FROM numers, envelope
GROUP BY env, cnt)) as data(
denom_aggregate text, denom_colname text,
denom_description text, denom_geomref_colname text,
denom_id text, denom_name text, denom_reltype text,
denom_t_description text, denom_tablename text,
denom_type text, geom_colname text, geom_description text,
geom_geomref_colname text, geom_id text, geom_name text,
geom_t_description text, geom_tablename text,
geom_timespan text, geom_type text, id numeric,
max_score_rank text, max_timespan_rank text,
normalization text, num_geoms numeric, numer_aggregate text,
numer_colname text, numer_description text,
numer_geomref_colname text, numer_id text,
numer_name text, numer_t_description text,
numer_tablename text, numer_timespan text,
numer_type text, score numeric, score_rank numeric,
score_rownum numeric, suggested_name text,
target_area text, target_geoms text, timespan_rank numeric,
timespan_rownum numeric)
WHERE numer_aggregate <> 'quantile' 

The result is unexpected for two reasons: 1) the only geoms are Core Based Statistical Area (CBSA)... but there should be more geom levels (census tract, census block group... etc.) 2) the two rows returned are identical, except for normalization. One has predenominated and the other has prenormalized... this is unexpected. In fact, I'm not sure what the difference is for these terms. I thought the only valid normalization values in the metadata output were area, predenominated, or denominated?

update: I figured out a reason for the first problem. It's in the query OBS_GetMeta(envelope.env, json_agg(numers), 10, 10, envelope.cnt) ... envelope.cnt is set to 500; so our query is matching the boundary with close to this many objects within the extent... However, the arguments num_timespan_options and num_score_options (10 and 10 in this query) don't seem to be doing anything.