CartoDB / cartodb

Location Intelligence & Data Visualization tool
http://carto.com
BSD 3-Clause "New" or "Revised" License

Make DO multiple measure auto-select best geometry #12623

Closed javitonino closed 7 years ago

javitonino commented 7 years ago

DO has quite an interesting system for automatically detecting the best geometry for a given dataset and augmentation. Builder already receives this information, but it currently does nothing with it. It would be great if the geometry that DO considers best were the default in the UI.

When asking for the list of geometries in a multiple measure analysis, DO returns something like:

[
    {
        "geom_id": "us.census.tiger.county_clipped",
        "geom_name": "Shoreline clipped US County",
        "geom_weight": "5.010000000000000000208166817",
        "score": 48.9005821201008,
        [...]
    },
    {
        "geom_id": "us.census.tiger.puma_clipped",
        "geom_name": "Shoreline clipped US Census Public Use Microdata Areas",
        "geom_weight": "4.010000000000000000208166817",
        "score": 40.4863299235498,
        [...]
    },
    {
        "geom_id": "us.census.tiger.congressional_district_clipped",
        "geom_name": "Shoreline clipped US Congressional Districts",
        "geom_weight": "2.010000000000000000208166817",
        "score": 30.973792235478125,
        [...]
    },
    {
        "geom_id": "us.census.tiger.state_clipped",
        "geom_name": "Shoreline clipped US States",
        "geom_weight": "1.010000000000000000208166817",
        "score": 26.004888297362797,
        [...]
    }
]

Builder uses the geom_weight field to order the geometries in the slider component. score is the result of the auto-detection algorithm: the higher the score, the better the geometry. So, in this example, the county-level geometry (score 48.9) should be selected by default.
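To make the distinction between the two fields concrete, here is a minimal Python sketch (not Builder's actual code, which is JavaScript). The function names order_for_slider and default_geometry are hypothetical, the elided `[...]` fields are left out, and sorting by descending weight is an assumption about the slider's direction:

```python
from decimal import Decimal

# Sample rows copied from the response above (elided fields omitted).
geometries = [
    {"geom_id": "us.census.tiger.county_clipped",
     "geom_weight": "5.010000000000000000208166817", "score": 48.9005821201008},
    {"geom_id": "us.census.tiger.puma_clipped",
     "geom_weight": "4.010000000000000000208166817", "score": 40.4863299235498},
    {"geom_id": "us.census.tiger.congressional_district_clipped",
     "geom_weight": "2.010000000000000000208166817", "score": 30.973792235478125},
    {"geom_id": "us.census.tiger.state_clipped",
     "geom_weight": "1.010000000000000000208166817", "score": 26.004888297362797},
]


def order_for_slider(geoms):
    """Order geometries by geom_weight (hypothetical helper).

    geom_weight arrives as a string, so it is parsed with Decimal to keep
    its full precision. Descending order is an assumption.
    """
    return sorted(geoms, key=lambda g: Decimal(g["geom_weight"]), reverse=True)


def default_geometry(geoms):
    """Return the geometry with the highest score (hypothetical helper)."""
    return max(geoms, key=lambda g: g["score"])


print(default_geometry(geometries)["geom_id"])
```

The point is that the slider order and the default selection come from different fields: geom_weight decides where each geometry sits, score decides which one starts selected.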

cc @noguerol

noguerol commented 7 years ago

+1

juanignaciosl commented 7 years ago

Since the Charlie team handles DO stuff and we now have frontenders, I think we can take this. @noguerol, could you think about when this should be prioritized, please?

PS: it's related to CartoDB/observatory-extension/issues/309

ethervoid commented 7 years ago

Acceptance

You need to have the new observatory and DS version installed: https://github.com/CartoDB/observatory-extension/issues/309#issuecomment-335730527

Then you have to test with several different datasets to see which geometry gets pre-selected. To check which geometry has the highest score, run this query through the SQL API:

http://{{user}}.carto-staging.com/api/v2/sql?q=WITH _data as (SELECT * FROM OBS_GetAvailableGeometries(bounds => (SELECT ST_SetSRID(ST_Extent(the_geom), 4326) FROM (SELECT * FROM {{table_name}}) q), numer_id => '{{numer_id}}', number_geometries => (SELECT CDB_EstimateRowCount('SELECT * FROM {{table_name}}')::INTEGER), timespan => '{{timespan}}') denoms WHERE valid_numer IS TRUE AND valid_timespan = TRUE) SELECT *, rank() OVER (ORDER BY score DESC) FROM _data WHERE CASE WHEN EXISTS(SELECT 1 as cond FROM _data WHERE geom_tags ?'boundary_type/tags.interpolation_boundary' group by cond) THEN geom_tags ? 'boundary_type/tags.interpolation_boundary' ELSE true END ORDER BY geom_weight::numeric DESC;&api_key={{api_key}}

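For it you need to replace the placeholders in the URL: {{user}}, {{table_name}}, {{numer_id}}, {{timespan}} and {{api_key}}.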

Once you run the query, you'll see that the results come with a rank. The geometry with rank 1 should be the pre-selected value in the geometries slider.

Also, we have to test that if we change the value manually, it remains selected the next time we edit the analysis.
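For anyone who wants to script the check instead of pasting the URL, here is a rough Python sketch. It only builds the request URL from the query above and picks the rank 1 row out of a response. The helper names are hypothetical, and it assumes the SQL API wraps results in a "rows" array. The plain string substitution is fine for a manual acceptance test but is not safe against untrusted input.

```python
from urllib.parse import urlencode

# The acceptance query from above, with its {{...}} placeholders kept as-is.
QUERY_TEMPLATE = (
    "WITH _data as (SELECT * FROM OBS_GetAvailableGeometries("
    "bounds => (SELECT ST_SetSRID(ST_Extent(the_geom), 4326) "
    "FROM (SELECT * FROM {{table_name}}) q), "
    "numer_id => '{{numer_id}}', "
    "number_geometries => (SELECT CDB_EstimateRowCount("
    "'SELECT * FROM {{table_name}}')::INTEGER), "
    "timespan => '{{timespan}}') denoms "
    "WHERE valid_numer IS TRUE AND valid_timespan = TRUE) "
    "SELECT *, rank() OVER (ORDER BY score DESC) FROM _data "
    "WHERE CASE WHEN EXISTS(SELECT 1 as cond FROM _data WHERE geom_tags "
    "?'boundary_type/tags.interpolation_boundary' group by cond) "
    "THEN geom_tags ? 'boundary_type/tags.interpolation_boundary' "
    "ELSE true END ORDER BY geom_weight::numeric DESC;"
)


def build_sql_api_url(user, table_name, numer_id, timespan, api_key):
    """Fill the placeholders and URL-encode the query (hypothetical helper)."""
    query = QUERY_TEMPLATE
    values = {"table_name": table_name, "numer_id": numer_id, "timespan": timespan}
    for name, value in values.items():
        # Naive substitution: do not use with untrusted values.
        query = query.replace("{{" + name + "}}", value)
    params = urlencode({"q": query, "api_key": api_key})
    return "http://%s.carto-staging.com/api/v2/sql?%s" % (user, params)


def preselected_geometry(response):
    """Return the first row with rank 1 from a parsed JSON response.

    Assumes the response has a "rows" list. rank() can produce ties,
    in which case the first tied row in the returned order wins here.
    """
    return next(row for row in response["rows"] if row["rank"] == 1)
```

The returned geom_id can then be compared by hand with whatever the geometries slider shows as selected when the analysis is first created.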