CartoDB / carto-vl

CARTO VL: a JavaScript library to create vector-based visualizations
BSD 3-Clause "New" or "Revised" License

Handle data changes aka real-time maps #670

Open · rochoa opened 6 years ago

rochoa commented 6 years ago

When the data changes for a map/layer, the client should be able to get a fresh copy of it.

I envision something like:

  1. The map/layer is instantiated in the Maps API, which returns a Last-Modified header.
  2. The library starts retrieving tiles.
  3. Once a tile comes back with a Last-Modified header newer than the one from the map instantiation, the library has to refresh the required data by instantiating the map again to get the new set of URLs (see the sketch below).
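
A rough sketch of that client-side flow; `MAPS_API_URL`, `config`, and `reinstantiateAndRefresh` are hypothetical placeholders, not the final API:

```js
let mapLastModified = null;

async function instantiateMap(config) {
  // 1. Instantiate the map in the Maps API and remember its Last-Modified header.
  const response = await fetch(MAPS_API_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(config)
  });
  mapLastModified = new Date(response.headers.get('Last-Modified'));
  return response.json(); // contains the set of tile URLs
}

async function fetchTile(tileUrl) {
  // 2. The library retrieves tiles as usual...
  const response = await fetch(tileUrl);
  // 3. ...and when a tile is newer than the map instantiation, refreshes the data.
  const tileLastModified = new Date(response.headers.get('Last-Modified'));
  if (mapLastModified && tileLastModified > mapLastModified) {
    await reinstantiateAndRefresh(); // re-instantiate the map, reload tiles
  }
  return response.arrayBuffer();
}
```
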
IagoLast commented 6 years ago

😍

We can be really ambitious here and get real real time ( :trollface: )

elenatorro commented 6 years ago

Ok I think we have a candidate for this task 😂

Captain-Oski commented 5 years ago

I'm following up on this open issue; is it still on the shelf?

Would love a new carto.source.realTime 😍

IagoLast commented 5 years ago

cc @davidmanzanares @oleurud

rkertesz commented 5 years ago

Not sure if this is still under development, but I watched some presentations from Google at SXSW a while ago where they suggested WebSockets. Esri is also using WebSockets to stream data to the browser from their GeoEvent Server. That seems like a good way to go, especially given that Torque isn't set up to work with VL, meaning the current VL animation examples require the client to hold all the data and just filter it client-side. IMO that is subject to poor client-side performance and very long data download times. What if a person wants to animate a year of historical data of accidents in a city or something crazy?
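
For illustration, a minimal sketch of that streaming approach on top of the existing public API (`carto.source.GeoJSON` plus the `layer.update(source)` call mentioned later in this thread); the endpoint is hypothetical:

```js
// Hypothetical endpoint that pushes one GeoJSON Feature per message.
const socket = new WebSocket('wss://example.com/sensor-stream');
const features = [];

socket.onmessage = (event) => {
  features.push(JSON.parse(event.data));
  // Swap in a fresh GeoJSON source so the layer re-renders with the new point.
  layer.update(new carto.source.GeoJSON({
    type: 'FeatureCollection',
    features
  }));
};
```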

rkertesz commented 5 years ago

@IagoLast and @rochoa This is what I am talking about re: the Esri solution... https://developers.arcgis.com/javascript/3/jsapi/streamlayer.html

IagoLast commented 5 years ago

@rkertesz

> What if a person wants to animate a year of historical data of accidents in a city or something crazy?

As you said, CARTO VL holds the data client-side, but it is smart about which data it downloads, so performance will be good.

I suggest you check out the animation filter to animate historical data.

Related example: https://carto.com/developers/carto-vl/examples/#example-taxi-pickups
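
Sketching the idea (table and column names are placeholders):

```js
// Animate features over a timestamp column: a 30-second loop with 1s fades.
const source = new carto.source.Dataset('accidents'); // placeholder table
const viz = new carto.Viz(`
  width: 4
  filter: animation($date, 30, fade(1, 1))
`);
const layer = new carto.Layer('layer0', source, viz);
layer.addTo(map, 'watername_ocean');
```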

rkertesz commented 5 years ago

Thank you for acknowledging my comment

Can you elaborate on what you mean when you say it is smart?

Also, back to the original post, it seems elegant to have a trigger (not necessarily in the sense of a DB trigger; I just mean, more generally, an event) to stream this without refreshing the map itself. This means the user doesn't see a flicker while waiting for elements to reload.

To get to an MVP, the trigger doesn't even need to fire on data update; it could simply be on a timer, where the user specifies (say, via a GUI) that the displayed features should update every minute. Ideally, in v2, it would occur when the database is refreshed with new data for the displayed features. A sketch of the timer version is below.
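
Something like this, reusing the `layer.update(source)` call mentioned later in this thread (the table name is a placeholder):

```js
// MVP: re-fetch the source on a fixed timer (every minute).
setInterval(() => {
  layer.update(new carto.source.Dataset('sensors'));
}, 60 * 1000);
```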

IagoLast commented 5 years ago

@rkertesz

> Can you elaborate on what you mean when you say it is smart?

Take a look at this post :)

rochoa commented 5 years ago

@rkertesz, if you know your data is evolving over time and you want a fresher version, you can call `layer.update(source)` to force a refresh.

About the smart part, and more specifically about your "animate a year of historical data of accidents in a city" use case: the backend will aggregate the geospatial and temporal dimensions. Apart from the blog post that @IagoLast recommended, I encourage you to take a look at the ClusterTime expression documentation. We will create better examples for those use cases.
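
For example (a sketch; the table and column names are placeholders, and the `clusterTime` unit string is an assumption, so check the documentation for the valid values):

```js
// Force a refresh when you know the data has evolved.
layer.update(new carto.source.Dataset('accidents'));

// Aggregate the temporal dimension server-side instead of shipping raw rows;
// 'dayOfWeek' is an assumed unit string, see the ClusterTime docs.
const viz = new carto.Viz(`
  width: 6
  color: ramp(clusterTime($date, 'dayOfWeek'), PRISM)
`);
```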

rkertesz commented 5 years ago

@rochoa That link to the ClusterTime documentation is informative. Clustering works well when you have a viewport showing a lot of overlapping data. I agree that this is smart. That said, in some cases I commonly work with, there may be "only" 100-500 sensor locations in a city. Perhaps each sensor measures 3 parameters, but to simplify the problem, say we only want to look at a single parameter at a time (temperature, for example), and let's assume the locations are static to simplify further. With 500 potential data points per measurement interval (e.g., one per minute), over one month of data (~43,800 minutes) we have a maximum payload of 43,800 × 500 = 21,900,000 data points to download and then animate. Geospatial aggregation could potentially cut this, but 500 data points can be rendered on a monitor without significant overlap when zoomed out to the municipal level. A payload of ~20M points isn't horrible, but I don't think it is going to be seamless either.

Now, you may ask why someone would want a minute-by-minute animation over a month. I don't think they do, at first. However, adding some additional intelligence to the way time-series data are downloaded to the client could help. There are a number of possibilities I can think of off the top of my head.

rochoa commented 5 years ago

Hey, @rkertesz,

First of all, thanks for your brain-dump.

The use-case you are mentioning might require a more specialised solution.

At CARTO (VL), we can aggregate the data along different dimensions, with more granular control over the geospatial and temporal ones: you can control how to aggregate your points within a grid and, now, you can also control how to aggregate the temporal dimension with time ranges and expressions. However, we transfer as many geometries as the dimensions you define produce.
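
For example, an aggregated viz might look like this (a sketch, as I recall the aggregation expressions; `$temperature` is a placeholder column):

```js
// Grid-aggregated rendering: size by cluster count, color by average value.
const viz = new carto.Viz(`
  resolution: 16
  width: sqrt(clusterCount())
  color: ramp(linear(clusterAvg($temperature)), TEALROSE)
`);
```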

That's why I was saying it makes more sense to have a specialised solution where you only transfer the geometries for your different sensors and the data associated with them separately.

To be honest, and for the time being, I don't see anything similar to your proposals happening in CARTO VL. However, your ideas are really good and could sit on top of CARTO VL: you could build an application with a better UI/UX that abstracts the filtering and decides the aggregation and granularity based on the user's inputs. CARTO (VL) is very capable; for well-structured datasets, it can handle on the order of millions of data points.
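
To sketch that split (both endpoints here are hypothetical, and `layer` is an existing CARTO VL layer):

```js
// Static sensor geometries are fetched once; per-sensor readings are polled
// separately and merged client-side before refreshing the layer.
let sensors; // GeoJSON FeatureCollection of static sensor locations

async function init() {
  sensors = await (await fetch('/api/sensors.geojson')).json();
  setInterval(refreshReadings, 60 * 1000);
}

async function refreshReadings() {
  const readings = await (await fetch('/api/readings/latest')).json();
  for (const feature of sensors.features) {
    feature.properties.temperature = readings[feature.properties.sensor_id];
  }
  layer.update(new carto.source.GeoJSON(sensors));
}

init();
```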

I just created an example with a per-day animation of ~200 sensors, a data refresh interval of 30 minutes, and data for a year, for a total of ~3.7M entries.