timwis / vizwit

An interactive data visualization tool
http://vizwit.io
GNU General Public License v2.0
100 stars 35 forks source link

CartoDB Provider #171

Closed timwis closed 7 years ago

timwis commented 8 years ago

The work was done by @chriswhong and @talos, I just patched their changes to accommodate #149

timwis commented 8 years ago

Fixed offset, export, record count, and full text search. Should be ready to merge, just wanna test how choropleth map would work.

So far, this is one way to get it to work similar to how they're working on socrata:

SELECT boundaries.cartodb_id, COUNT(incidents.*)
FROM
    crimes_2015_to_oct_2016 as incidents,
    police_district_boundaries as boundaries
WHERE
    ST_Within(incidents.the_geom, boundaries.the_geom)
GROUP BY boundaries.cartodb_id

Here's an even more precise one from IRC.

Alternatively, you can create dataset from query after running something like:

SELECT incidents.*, boundaries.cartodb_id AS boundary_id
FROM
    crimes_2015_to_oct_2016 as incidents,
    police_district_boundaries as boundaries
WHERE
    ST_Within(incidents.the_geom, boundaries.the_geom)

But I'm getting an error Invalid cartodb_id column (2011) when I try to do this. Would the latter approach even keep up to date with the source dataset?

timwis commented 8 years ago

Okay I could use some advice on this one. The way vizwit is designed to work is: the dataset is meant to have a column relating to the boundary, ie. "police district objectid." In socrata, these columns are added magically when you create datalens pages (denormalised onto the dataset as system fields). Theoretically, if you had something simpler like CKAN datastore you could compute these yourself and add them to the dataset. So vizwit does a query like SELECT boundary_id, count(*) FROM dataset GROUP BY boundary_id, then fetches the geojson and does a join on the boundary_id.

Carto may be able to make this simpler, resulting in a single request since it could do the join and return the boundaries geojson, but it would involve a much larger refactor to accommodate that (not to mention that no other datastore will do that, so it'd be a provider-specific map card).

Thoughts? /cc @chriswhong @talos

chriswhong commented 8 years ago

If you are doing to join client side it is the same amount of work to get the nonspatial data from carto and do a 2nd API call to get the geometries.

Carto does allow you to get everything in one call to the API but nothing else can replicate that so to me it makes sense to just do the join client side.

-C

On Sunday, October 30, 2016, Tim Wisniewski notifications@github.com wrote:

Okay I could use some advice on this one. The way vizwit is designed to work is: the dataset is meant to have a column relating to the boundary, ie. "police district objectid." In socrata, these columns are added magically when you create datalens pages (denormalised onto the dataset as system fields). Theoretically, if you had something simpler like CKAN datastore you could compute these yourself and add them to the dataset. So vizwit does a query like SELECT boundary_id, count(*) FROM dataset GROUP BY boundary_id, then fetches the geojson and does a join on the boundary_id.

Carto may be able to make this simpler, resulting in a single request since it could do the join and return the boundaries geojson, but it would involve a much larger refactor to accommodate that (not to mention that no other datastore will do that, so it'd be a provider-specific map card).

Thoughts? /cc @chriswhong https://github.com/chriswhong @talos https://github.com/talos

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/timwis/vizwit/pull/171#issuecomment-257167994, or mute the thread https://github.com/notifications/unsubscribe-auth/ABv7XI9VSYBJsvfrZ7s8ZmMOgp6zJO1Aks5q5N57gaJpZM4Hy8Jt .

timwis commented 8 years ago

Progress so far: http://vizwit-cartodb.surge.sh/?gist=39c8e1fd5639db1f752298baf3450fb4

EDIT: oh shoot. just got it working with 3 join properties in the config for the map card and realised that cross-filtering won’t work. when you click a polygon in the map, it tries to filter all the other charts by adding the filter where (boundaries.cartodb_id = 17), but all the other charts don’t have the join on them … so you’d need to add the joining to every single card to get the effect

Anyone know off-hand if carto’s “create dataset from query” functionality keeps the new dataset up to date with the source dataset(s)?