CartoDB / Windshaft-cartodb

Windshaft tailored for CARTO
BSD 3-Clause "New" or "Revised" License

Implement Metadata as needed by CARTO VL #948

Closed jgoizueta closed 6 years ago

jgoizueta commented 6 years ago

CartoVL is using the SQL API to obtain metadata about the dataset/query used, including a sample of the data. We must avoid using the SQL API from CartoVL and obtain this data from the Maps API to avoid requiring both Maps API and SQL API authorization keys.

At this point we want to quickly implement what CartoVL needs, and eventually refactor it into something more reasonable and efficient.

We could simply implement an ad-hoc endpoint now that performs the exact same queries and data processing that CartoVL currently does.

Or, if we consider that the effort will be similar (I'm inclined to think so), we could implement it as optional metadata returned by the map instantiation. This would offer opportunities for optimization (now or later): some of the metadata may already be computed or used, we would save requests, and we could also save queries by combining the requested metadata with, e.g., the needs of getAggregationMetadata.

We could add a parameter to request metadata, e.g. "metadata": { sample: true, rowCount: true, columnStats: true } and the data could be added to the existing metadata.stats in the response. This could be nicely encapsulated in the setLayerStats function of the Windshaft-cartodb maps controller.
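As a rough illustration of the request side of this proposal, the sketch below adds the suggested "metadata" block to a map config before it is sent for instantiation. The helper name with_metadata_request, the table name, the version string and the placement of "metadata" at the top level of the config are all assumptions made for the example; only the three option names (sample, rowCount, columnStats) come from the proposal above.

```python
import json


def with_metadata_request(map_config, sample=True, row_count=True, column_stats=True):
    """Return a copy of map_config that also asks for optional metadata.

    Hypothetical helper: placing "metadata" at the top level of the map
    config is an assumption, not a confirmed Maps API contract.
    """
    requested = dict(map_config)  # shallow copy, the caller's dict is untouched
    requested["metadata"] = {
        "sample": sample,
        "rowCount": row_count,
        "columnStats": column_stats,
    }
    return requested


# Trimmed-down map config; table name and version are illustrative only.
map_config = {
    "version": "1.7.0",
    "layers": [
        {"type": "mapnik", "options": {"sql": "SELECT * FROM my_table"}},
    ],
}

request_body = json.dumps(with_metadata_request(map_config))
print(request_body)
```

Keeping the flags separate would let a client ask only for what it needs, so that, for instance, the sample is not computed when only the row count is used.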

Details

What CartoVL does now

All the metadata that CartoVL requests now is actually needed. The windshaft module encapsulates all SQL API requests through getSQL, which is used by the following functions, called from _getMetadata to prepare the metadata:

In addition to the metadata (categoryIDs, columns, featureCount, sample), geomType is kept in the windshaft object; it is used to determine whether aggregation is possible and to decode MVT.

What the tiler already does at instantiation

The query-utils module of Windshaft-cartodb contains some functions to fetch metadata about the query. In particular, the function getAggregationMetadata, which is used to determine whether aggregation should be applied, returns:

When default aggregation is used (sampling) the columns of the original query are obtained with a LIMIT 0 query (in getLayerAggregationColumns) to set the columns layer parameter.
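The LIMIT 0 technique mentioned here can be sketched as follows: wrapping the user query in a subquery with LIMIT 0 returns no rows, but the result still carries the column names. This is not the actual getLayerAggregationColumns code; the function names are hypothetical, and SQLite is used only as a stand-in for PostgreSQL so that the example runs on its own.

```python
import sqlite3


def columns_probe_sql(query):
    """Wrap a query so that it returns zero rows but keeps its column list."""
    inner = query.strip().rstrip(";")  # a trailing semicolon would break the subquery
    return f"SELECT * FROM ({inner}) AS _probe LIMIT 0"


def query_columns(conn, query):
    """Return the column names of a query without fetching any data."""
    cursor = conn.execute(columns_probe_sql(query))
    # cursor.description is populated even when the result set is empty
    return [column[0] for column in cursor.description]


# In-memory table standing in for a user dataset (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (cartodb_id INTEGER, name TEXT, population INTEGER)")

print(query_columns(conn, "SELECT name, population FROM places;"))
# prints ['name', 'population']
```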

The map instantiation response contains layergroup.metadata.layers[0].meta.stats.estimatedFeatureCount (which could be extended for additional metadata). It also contains layergroup.metadata.layers[0].meta.stats.aggregation which could be used for aggregated stats at some point.
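A client could read these values defensively, since the stats may be missing for some layers. The sketch below assumes the response has already been parsed from JSON into a dict; the helper name layer_stats and the sample values are made up for illustration, and only the path metadata.layers[i].meta.stats comes from the description above.

```python
def layer_stats(layergroup, layer_index=0):
    """Return the meta.stats dict of one layer, or {} if it is not present."""
    layers = layergroup.get("metadata", {}).get("layers", [])
    if layer_index < 0 or layer_index >= len(layers):
        return {}
    return layers[layer_index].get("meta", {}).get("stats", {})


# Hand-written example response; the values are not real API output.
response = {
    "layergroupid": "example-id",
    "metadata": {
        "layers": [
            {"type": "mapnik", "meta": {"stats": {"estimatedFeatureCount": 1234}}},
        ]
    },
}

stats = layer_stats(response)
print(stats.get("estimatedFeatureCount"))
```

Any additional metadata (sample, row count, column stats) added under the same stats object could then be read through the same helper.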

jgoizueta commented 6 years ago

The first thing to decide is whether it's a good idea to implement the metadata request in the map instantiation endpoint(s), or to add a new specific endpoint.

Then we can start with the implementation; other details can be decided later/after some experimentation:

jgoizueta commented 6 years ago

Experimental map instantiation metadata is now available in #952

However, there is a problem with returning metadata at instantiation given how we currently use it in the client: Carto VL uses metadata for these two details of the instantiation:

Possible solutions

Note that A and B are modifications of the Maps API. C and D involve only Carto VL changes.

jgoizueta commented 6 years ago

Since MVT does not support date/time types, it would be nice to be able to cast those types into something (text strings or epoch numbers) that can be transferred in the MVT.

@Algunenano has mentioned that Mapnik doesn't currently support time/date types for styling, and that implementing the automatic casting at the plugin level would not only make those types available in MVTs, but would also allow using them to style raster tiles.
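One way to picture the casting to epoch numbers is as a query rewrite, sketched below. This is only an illustration of the idea at the SQL level, not the plugin-level implementation discussed above: the function name, the set of type names and the assumption that the column types are already known are all hypothetical. The generated SQL relies on PostgreSQL's date_part('epoch', ...).

```python
# PostgreSQL type names treated as date/time in this sketch (an assumed list).
DATE_TYPES = {"date", "timestamp", "timestamptz"}


def quote_ident(name):
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def cast_dates_sql(query, column_types):
    """Wrap query so that date/time columns come out as epoch seconds.

    column_types maps column name -> PostgreSQL type name and is assumed
    to have been obtained beforehand (e.g. from a LIMIT 0 probe).
    """
    select_list = []
    for name, pg_type in column_types.items():
        ident = quote_ident(name)
        if pg_type in DATE_TYPES:
            select_list.append(f"date_part('epoch', {ident}) AS {ident}")
        else:
            select_list.append(ident)
    return f"SELECT {', '.join(select_list)} FROM ({query}) AS _casted"


print(cast_dates_sql(
    "SELECT * FROM events",
    {"the_geom_webmercator": "geometry", "created_at": "timestamptz"},
))
```

Epoch numbers keep the values numeric, which fits the later request in this thread for a compact representation rather than strings.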

davidmanzanares commented 6 years ago

I would like a flavor of D.

Regarding the timestamp management, I think it would be best if Maps API automatically cast it to a usable form. Ideally, it would be compressed in some way (no strings).

Regarding filters, I would move the conditional logic to Maps API. I wouldn't apply filters every time: we saw this is overkill for most maps (small and medium datasets), because they would no longer be able to refilter instantly with just client-side logic, and the MVT sizes would be small even without taking filtering into account.

Basically, I think Maps API should return an instantiated map and a flag saying if filters were applied or not (similar to aggregation). When the filters change in the client, CARTO VL should re-instantiate if the flag indicates that Maps API filtered in the last instantiation.
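The client-side decision described here could look roughly like the sketch below. The flag name filtersApplied and the shape of the instantiation result are assumptions; the thread only says that Maps API would return a flag similar to the aggregation one.

```python
def needs_reinstantiation(last_instantiation, old_filters, new_filters):
    """Decide whether a filter change requires a new map instantiation.

    last_instantiation is the parsed result of the previous instantiation;
    "filtersApplied" is a hypothetical flag name.
    """
    if old_filters == new_filters:
        return False  # nothing changed, the current tiles are still valid
    # If the server did not filter, the tiles hold all the data and the
    # client can refilter locally; otherwise new tiles are needed.
    return bool(last_instantiation.get("filtersApplied", False))


small_map = {"filtersApplied": False}
large_map = {"filtersApplied": True}

print(needs_reinstantiation(small_map, "$price > 10", "$price > 20"))  # prints False
print(needs_reinstantiation(large_map, "$price > 10", "$price > 20"))  # prints True
```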

Jesus89 commented 6 years ago

I'm reopening this so it can be closed after deployment.

Jesus89 commented 6 years ago

Closing this.