Indicators stored in multiple end-points

doorleyr commented 4 years ago

Following from the discussion today: there are multiple modules which aggregate their results to create indicators. Currently there is only one /indicators end-point which drives the radar plot. In order to prevent the modules from overwriting each other's updates, I'm currently only allowing one module (CS_Urban_Indicators) to make the updates and this module is responsible for collecting the indicators from the other modules. This is currently causing large latencies in updating the radar chart. To avoid this, I propose we have multiple end-points for the different categories of indicators. I have created these end-points for the corktown table:

https://cityio.media.mit.edu/api/table/corktown/density_indicators
https://cityio.media.mit.edu/api/table/corktown/diversity_indicators
https://cityio.media.mit.edu/api/table/corktown/proximity_indicators
https://cityio.media.mit.edu/api/table/corktown/mobility_indicators

RELNO commented 4 years ago

I'm not sure I understand what causes the latency. If the radar module is collecting the output of multiple other modules, why can't it update as soon as new data arrives (regardless of the other modules)? In any case, multiple indicator endpoints create significant overhead on the FE, since I'll have to call each of these endpoints every 100 ms (or so).

doorleyr commented 4 years ago

I'll try to explain a bit better. Right now, the density and diversity indicators are updated by one script and the accessibility indicators are computed by another script. These two could potentially be combined, but there is at least one other module that will compute indicators and there may be more in the future, so I think we need this flexibility. I don't want to allow multiple modules to get and post the full set of indicators because they may overwrite each other's data. For example, the accessibility module might get the latest indicators, compute an update and post the updated data, but while it was computing, the indicators module may have posted its own update, which now gets overwritten.
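The overwrite scenario can be reproduced with a small in-memory sketch. Nothing here talks to cityIO: `store`, `get_indicators` and `post_indicators` are hypothetical stand-ins for the shared /indicators endpoint, and the indicator values are made up.

```python
# In-memory stand-in for the shared /indicators endpoint (hypothetical values).
store = {"density": 0.4, "proximity": 0.5}

def get_indicators():
    """Return a copy of the full indicator set, like a GET of /indicators."""
    return dict(store)

def post_indicators(payload):
    """Replace the full indicator set, like a POST to /indicators."""
    store.clear()
    store.update(payload)

# 1. The accessibility module reads the full set and starts computing.
access_snapshot = get_indicators()

# 2. Meanwhile, the indicators module reads, updates density and posts.
ind_snapshot = get_indicators()
ind_snapshot["density"] = 0.9
post_indicators(ind_snapshot)

# 3. The accessibility module finishes and posts its stale copy
#    with only the proximity value refreshed.
access_snapshot["proximity"] = 0.7
post_indicators(access_snapshot)

# The density update from step 2 has been overwritten by the stale value.
lost_update = store["density"] == 0.4
print(store, "density update lost:", lost_update)
```

Because each module posts the whole set, whichever one posts last wins for every field, including fields it never computed.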

So I can only let one module update the /indicators endpoint. The process I'm following at the moment for sending the updated accessibility indicators to the front end is as follows:

1. Changes occur to GEOGRIDDATA at the front end.
2. The accessibility module detects changes in GEOGRIDDATA, computes the accessibility indicators and posts them to an end-point (/ind_access) which is just for accessibility indicators (it's never read by the front end).
3. The main urban_indicators module detects changes in the accessibility indicators (/ind_access) and posts the full set of indicators to the /indicators end-point.
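Each "detects changes" step in this process amounts to polling a hash and reacting when it differs from the last one seen. Below is a minimal sketch of that pattern. `HashWatcher`, `fetch_hash` and `on_change` are hypothetical names, not part of cityIO or the existing modules, and the hash source is faked so the example runs without a network.

```python
class HashWatcher:
    """Call on_change whenever the polled hash differs from the last one seen."""

    def __init__(self, fetch_hash, on_change):
        self.fetch_hash = fetch_hash  # callable returning the current hash
        self.on_change = on_change    # callable invoked with the new hash
        self.last_hash = None

    def poll_once(self):
        """Poll one time; return True if a change was detected."""
        current = self.fetch_hash()
        if current != self.last_hash:
            self.last_hash = current
            self.on_change(current)
            return True
        return False

# Fake hash sequence: the data changes between the second and third poll.
hashes = iter(["a", "a", "b"])
seen = []
watcher = HashWatcher(lambda: next(hashes), seen.append)
results = [watcher.poll_once() for _ in range(3)]
print(results, seen)
```

In a real module, `poll_once` would be called in a loop with a sleep of N ms (or M ms) between calls, which is where the latencies discussed in this thread come from.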

This means that, after the GEOGRIDDATA is updated, there are multiple latencies in updating accessibility: the accessibility module checks the GEOGRIDDATA hash every N ms, and the indicators module checks the /ind_access hash every M ms. If we use the approach I suggested, it would remove the second latency.
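As a rough worked example of the two latencies: each polling stage adds up to one full polling interval in the worst case, and half an interval on average if changes arrive at a uniformly random point within the interval. The sketch below ignores compute and network time, and the 100 ms values for N and M are assumptions chosen only for illustration.

```python
def polling_delay_ms(poll_intervals_ms):
    """Return (worst_case, expected) delay added by a chain of pollers.

    Assumes each change arrives at a uniformly random point within the
    polling interval of the stage that has to notice it.
    """
    worst = sum(poll_intervals_ms)
    expected = sum(interval / 2 for interval in poll_intervals_ms)
    return worst, expected

N_MS = 100  # assumed GEOGRIDDATA polling interval of the accessibility module
M_MS = 100  # assumed /ind_access polling interval of the indicators module

chained = polling_delay_ms([N_MS, M_MS])  # current two-stage process
direct = polling_delay_ms([N_MS])         # accessibility posts to its own end-point
print("chained:", chained, "direct:", direct)
```

Under these assumptions, dropping the second polling stage halves the delay that polling contributes.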

I'm ok with continuing this if it's the best way but I think the way I suggested before will be faster overall.

Another possible solution would be to have only one indicators end-point (/indicators) but to allow me to post directly to sub-fields (/indicators/density, /indicators/proximity etc.). For this particular problem I think it's actually the best solution, but as I understand, this is something that you want to avoid. @yasushisakai what do you think?
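To illustrate why posting to sub-fields avoids the overwrite problem, here is a sketch of a deep-node update on a nested dictionary. This is not cityIO's API: `post_subfield`, the `table` structure and the values are hypothetical, and the real server would also need to handle validation and hashing.

```python
def post_subfield(store, path, value):
    """Set the node addressed by a slash-separated path, leaving siblings untouched."""
    parts = [p for p in path.strip("/").split("/") if p]
    node = store
    for key in parts[:-1]:
        # Walk down, creating intermediate nodes if they do not exist yet.
        node = node.setdefault(key, {})
    node[parts[-1]] = value

# Hypothetical table state with two indicator categories.
table = {"indicators": {"density": {"value": 0.4}, "proximity": {"value": 0.5}}}

# Two modules each post only their own category.
post_subfield(table, "/indicators/density", {"value": 0.9})
post_subfield(table, "/indicators/proximity", {"value": 0.7})
print(table)
```

Since each module only writes its own node, neither post can clobber the other's data, regardless of the order in which they arrive.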

yasushisakai commented 4 years ago

If it's better to be able to post deep nodes I can work on that. In fact, I think being able to choose the best data structure for each project is better. I will put it in my stack.

For the time being, @doorleyr's suggestion of having multiple endpoints for the indicators is the solution. I can see that the frontend can make the GET requests concurrently.
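A sketch of fetching the four endpoints concurrently and merging the results is shown below. The front end itself is JavaScript; Python is used here only to illustrate the idea. `fetch_all` and `fake_fetch` are hypothetical names, and the HTTP call is injected so the example runs without touching the network.

```python
from concurrent.futures import ThreadPoolExecutor

BASE = "https://cityio.media.mit.edu/api/table/corktown"
CATEGORIES = ["density", "diversity", "proximity", "mobility"]

def fetch_all(fetch, categories=CATEGORIES):
    """Fetch every <category>_indicators endpoint in parallel and merge by category."""
    urls = [f"{BASE}/{category}_indicators" for category in categories]
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        # pool.map preserves the input order, so results line up with categories.
        results = list(pool.map(fetch, urls))
    return dict(zip(categories, results))

def fake_fetch(url):
    """Stand-in for a real HTTP GET; just echoes the URL it was asked for."""
    return {"url": url}

merged = fetch_all(fake_fetch)
print(merged["density"])
```

With a real HTTP client passed in as `fetch`, the total wait is roughly that of the slowest single request rather than the sum of all four.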

RELNO commented 4 years ago

Closing after conv. w/ @doorleyr. The issue looks like network speed, not modules.

CityScope / CS_cityscopeJS

Indicators stored in multiple end-points #60