OpenRailAssociation / osrd

An open source web application for railway infrastructure design, capacity analysis, timetabling and simulation
https://osrd.fr
439 stars 39 forks source link

Improve `layers/tile/{z}/{x}/{y}` route #3078

Open bloussou opened 1 year ago

bloussou commented 1 year ago

Response time of this route is not as fast as expected and could be improved in order to have a smoother experience when adding layers layers on the map.

Redis database is here to hide really poor query performances atm.

With some changes query time can be drastically lowered (from 2.7s to 150ms) in the video the redis cache is deactivated (code is not trying to get or set a value in the redis database) code can be found here. Screencast from 02-02-2023 16:48:01.webm

Performance tracking:

SQL queries

Track sections request

Current request, no processing in SQL will be done later using rust : 350ms

```sql WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsGeoJson(geographic) AS geom, track_section.data - 'geo' - 'sch' AS data FROM osrd_infra_tracksectionlayer layer CROSS JOIN bbox inner join osrd_infra_tracksectionmodel track_section on track_section.obj_id = layer.obj_id and track_section.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND geographic && bbox.geom AND ST_GeometryType(geographic) != 'ST_GeometryCollection' ```

Using jsonb operators and wrapping in a `json` with `json_build_object()` : 450ms

```sql WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsGeoJson(geographic) AS geo_json, json_build_object( 'length', track_section.data::jsonb -> 'length', 'extensions_sncf_track_name', track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'track_name', 'curves', track_section.data::jsonb ->> 'curves', 'extensions_sncf_line_name', track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_name', 'loading_gauge_limits', track_section.data::jsonb ->> 'loading_gauge_limits', 'extensions_sncf_line_code', track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_code' ) AS data FROM osrd_infra_tracksectionlayer layer CROSS JOIN bbox inner join osrd_infra_tracksectionmodel track_section on track_section.obj_id = layer.obj_id and track_section.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND schematic && bbox.geom AND ST_GeometryType(schematic) != 'ST_GeometryCollection' ```

Using jsonb operators and returning flatten data : 350ms

```sql WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsGeoJson(geographic) AS geo_json, track_section.data::jsonb -> 'length' as length, track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'track_name' as extensions_sncf_track_name, track_section.data::jsonb ->> 'curves' as curves, track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_name' as extensions_sncf_line_name, track_section.data::jsonb ->> 'loading_gauge_limits' as loading_gauge_limits, track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_code' as extensions_sncf_line_code FROM osrd_infra_tracksectionlayer layer CROSS JOIN bbox inner join osrd_infra_tracksectionmodel track_section on track_section.obj_id = layer.obj_id and track_section.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND schematic && bbox.geom AND ST_GeometryType(schematic) != 'ST_GeometryCollection' ```

Same as above but wrapping in a postgis mvt tile (no usage of the crate mvt in rust later) : 380ms

```sql with mvtgeom as ( WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsMVTGeom(geographic, bbox.geom, 4096, 64) AS geom, track_section.data::jsonb -> 'id' as id, track_section.data::jsonb -> 'length' as length, track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'track_name' as extensions_sncf_track_name, track_section.data::jsonb ->> 'curves' as curves, track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_name' as extensions_sncf_line_name, track_section.data::jsonb ->> 'loading_gauge_limits' as loading_gauge_limits, track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_code' as extensions_sncf_line_code FROM osrd_infra_tracksectionlayer layer CROSS JOIN bbox inner join osrd_infra_tracksectionmodel track_section on track_section.obj_id = layer.obj_id and track_section.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND geographic && bbox.geom AND ST_GeometryType(schematic) != 'ST_GeometryCollection' ) SELECT ST_AsMVT(mvtgeom.*, 'track_sections') as mvt_tile FROM mvtgeom; ```

Same as above but create a flat sql table to avoid storing costly jsonb in the database : 100ms

To do this first you need to run this scripts one after the other to create a flat table (would be done at db initialization if we can rework the model) Ceeate table: ```sql create table osrd_infra_track_sections_data ( id varchar(255), curves Text, length Numeric, extensions_sncf_line_code Integer, extensions_sncf_line_name Text, extensions_sncf_track_name Text, loading_gauge_limits text, infra_id int4 ); ``` fill table: ```sql INSERT INTO osrd_infra_track_sections_data SELECT track_section.data::jsonb ->> 'id', track_section.data::jsonb ->> 'curves', cast (track_section.data::jsonb -> 'length' as numeric), cast (track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_code' as integer), track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_name', track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'track_name', track_section.data::jsonb ->> 'loading_gauge_limits', track_section.infra_id FROM osrd_infra_tracksectionmodel track_section; ``` Query the new flat table ```sql with mvtgeom as ( WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsMVTGeom(geographic, bbox.geom, 4096, 64) AS geom, track_section.length, track_section.extensions_sncf_track_name, track_section.curves, track_section.extensions_sncf_line_name, track_section.loading_gauge_limits, track_section.extensions_sncf_line_code FROM osrd_infra_tracksectionlayer layer CROSS JOIN bbox inner join osrd_infra_track_sections_data track_section on track_section.id = layer.obj_id and track_section.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND geographic && bbox.geom AND ST_GeometryType(schematic) != 'ST_GeometryCollection' ) SELECT ST_AsMVT(mvtgeom.*, 'track_sections') as mvt_tile FROM mvtgeom; ``` This is without optimizations (no index for instance !!! looks like it could be possible to have queries close to `50ms` !!!).

Speed sections request

Current request, no processing in SQL will be done later using rust : 1.6s

```sql WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsGeoJson(geographic) AS geo_json, speed_section.data as data FROM osrd_infra_speedsectionlayer layer CROSS JOIN bbox inner join osrd_infra_speedsectionmodel speed_section on speed_section.obj_id = layer.obj_id and speed_section.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND (not (speed_section.data @? '$.extensions.lpv_sncf.z')) AND geographic && bbox.geom AND ST_GeometryType(geographic) != 'ST_GeometryCollection' ```

Using jsonb operators and wrapping in a `json` with `json_build_object()` : 1.7s

```sql WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsGeoJson(geographic) AS geo_json, json_build_object( 'track_ranges', speed_section.data::jsonb ->> 'track_ranges', 'speed_limit_by_tag', speed_section.data::jsonb -> 'speed_limit_by_tag', 'id', speed_section.data::jsonb -> 'id', 'speed_limit', speed_section.data::jsonb -> 'speed_limit' ) AS data FROM osrd_infra_speedsectionlayer layer CROSS JOIN bbox inner join osrd_infra_speedsectionmodel speed_section on speed_section.obj_id = layer.obj_id and speed_section.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND (not (speed_section.data @? '$.extensions.lpv_sncf.z')) AND geographic && bbox.geom AND ST_GeometryType(geographic) != 'ST_GeometryCollection' ```

Using jsonb operators and returning flatten data : 1.6s

```sql WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsGeoJson(geographic) AS geo_json, speed_section.data::jsonb ->> 'track_ranges' as track_ranges, speed_section.data::jsonb -> 'speed_limit_by_tag' as speed_limit_by_tag, speed_section.data::jsonb -> 'id' as id, speed_section.data::jsonb -> 'speed_limit' as speed_limit FROM osrd_infra_speedsectionlayer layer CROSS JOIN bbox inner join osrd_infra_speedsectionmodel speed_section on speed_section.obj_id = layer.obj_id and speed_section.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND (not (speed_section.data @? '$.extensions.lpv_sncf.z')) AND geographic && bbox.geom AND ST_GeometryType(geographic) != 'ST_GeometryCollection' ```

Same as above but wrapping in a postgis mvt tile (no usage of the crate mvt in rust later) : NOT DONE

NOT DONE YET AS `speed_limit_by_tag` is a dict with unknown keys ! ```sql ```

Signals request

Current request, no processing in SQL will be done later using rust : 350ms

```sql WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsGeoJson(geographic) AS geo_json, signal.data AS data FROM osrd_infra_signallayer layer CROSS JOIN bbox inner join osrd_infra_signalmodel signal on signal.obj_id = layer.obj_id and signal.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND geographic && bbox.geom AND ST_GeometryType(geographic) != 'ST_GeometryCollection' ```

Using jsonb operators and wrapping in a `json` with `json_build_object()` : 900ms

```sql WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsGeoJson(geographic) AS geo_json, json_build_object( 'extensions_sncf_comment', signal.data::jsonb -> 'extensions' -> 'sncf' ->> 'comment', 'extensions_sncf_angle_sch', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'angle_sch', 'extensions_sncf_angle_geo', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'angle_geo', 'extensions_sncf_label', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'label', 'extensions_sncf_value', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'value', 'track', signal.data::jsonb -> 'track', 'extensions_sncf_default_aspect', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'default_aspect', 'extensions_sncf_is_operational', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'is_operational', 'extensions_sncf_is_lightable', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'is_lightable', 'extensions_sncf_installation_type', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'installation_type', 'direction', signal.data::jsonb -> 'direction', 'logical_signals', signal.data::jsonb ->> 'logical_signals', 'id', signal.data::jsonb -> 'id', 'linked_detector', signal.data::jsonb -> 'linked_detector', 'extensions_sncf_side', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'side', 'sight_distance', signal.data::jsonb -> 'sight_distance', 'extensions_sncf_type_code', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'type_code', 'position', signal.data::jsonb -> 'position', 'extensions_sncf_is_in_service', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'is_in_service', 'extensions_sncf_support_type', signal.data::jsonb -> 'extensions' -> 'sncf' -> 'support_type' ) AS data FROM osrd_infra_signallayer layer CROSS JOIN bbox inner join osrd_infra_signalmodel signal on signal.obj_id = layer.obj_id and signal.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND geographic && bbox.geom AND ST_GeometryType(geographic) != 'ST_GeometryCollection' ```

Using jsonb operators and returning flatten data : 750ms

```sql WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsGeoJson(geographic) AS geo_json, track_section.data::jsonb -> 'length' as length, track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'track_name' as extensions_sncf_track_name, track_section.data::jsonb ->> 'curves' as curves, track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_name' as extensions_sncf_line_name, track_section.data::jsonb ->> 'loading_gauge_limits' as loading_gauge_limits, track_section.data::jsonb -> 'extensions' -> 'sncf' -> 'line_code' as extensions_sncf_line_code FROM osrd_infra_tracksectionlayer layer CROSS JOIN bbox inner join osrd_infra_tracksectionmodel track_section on track_section.obj_id = layer.obj_id and track_section.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND schematic && bbox.geom AND ST_GeometryType(schematic) != 'ST_GeometryCollection' ```

Same as above but wrapping in a postgis mvt tile (no usage of the crate mvt in rust later) : 180ms

```sql with mvtgeom as ( WITH bbox AS ( SELECT TileBBox(7, 64, 44, 3857) AS geom ) SELECT ST_AsMVTGeom(geographic, bbox.geom, 4096, 64) AS geom, signal.data::jsonb -> 'extensions' -> 'sncf' ->> 'comment' as extensions_sncf_comment, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'angle_sch' as extensions_sncf_angle_sch, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'angle_geo' as extensions_sncf_angle_geo, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'label' as extensions_sncf_label, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'value' as extensions_sncf_value, signal.data::jsonb -> 'track' as track, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'default_aspect' as extensions_sncf_default_aspect, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'is_operational' as is_operational, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'is_lightable' as extensions_sncf_is_lightable, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'installation_type' as extensions_sncf_installation_type, signal.data::jsonb -> 'direction' as direction, signal.data::jsonb ->> 'logical_signals' as logical_signals, signal.data::jsonb -> 'id' as id, signal.data::jsonb -> 'linked_detector' as linked_detector, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'side' as extensions_sncf_side, signal.data::jsonb -> 'sight_distance' as sight_distance, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'type_code' as extensions_sncf_type_code, signal.data::jsonb -> 'position' as position, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'is_in_service'as extensions_sncf_is_in_service, signal.data::jsonb -> 'extensions' -> 'sncf' -> 'support_type' as extensions_sncf_support_type FROM osrd_infra_signallayer layer CROSS JOIN bbox inner join osrd_infra_signalmodel signal on signal.obj_id = layer.obj_id and signal.infra_id = layer.infra_id WHERE layer.infra_id = 1 AND geographic && bbox.geom AND ST_GeometryType(geographic) != 'ST_GeometryCollection' ) SELECT ST_AsMVT(mvtgeom.*, 'track_sections') as mvt_tile FROM mvtgeom; ```

layers/tile/7/64/44 route times

Current jsonb and wrapping in data flat Flat and tile generation in sql Rework data model and mvt tile generation in sql no calls to add_tags_*, mvt crate benchmark
tracks 2.4s 1.4s 1.2 150ms 0.5s
Signals 1.4s 1.45s 0.28s
speed sections 3.1s 3.3s

crate::mvt

This crate is pretty bad at tile generation when many tags must be added (see column without add_tags_*), compared to postgis there is no match.

Conclusion

  1. crate::mvt is very slow
  2. Storing and manipulating jsonb make our life harder and slower ! We should rework the sql data model to have something flat and explicit and avoid json parsing everywhere in the code. Like the video is showing at the begining of the ticket we can have something very fast without redis
  3. If above is done : drop redis as it is just adding a complication in the architecture
  4. if storing json is a requirement then since we need to access nested data maybe a NoSQL model is the solution:

    
    JSON data can be stored in a traditional SQL database by adding a column that contains JSON data as a string. This approach can be appropriate when:
    
    The JSON data is relatively small and infrequently changed.
    The structure of the JSON data is simple, and most of the queries only require simple lookups or filters.

On the other hand, if the JSON data is large, frequently changed, or the queries are complex and require nested operations, it may be more appropriate to use a NoSQL database like MongoDB that is specifically designed to store and query JSON data.

In general, it's important to consider the data access patterns, scalability, and performance requirements when deciding whether to store JSON data in a SQL database or a NoSQL database.


5. This put us in a situation where cache are required at a lot of places and caching data is a complex task.
6. These very long queries will drive to a snowball effect on the database (if one of this query acquire a lock on a row it will be very very hard for other queries not to be slowed down).

# TODO

Manage to do the solution with `Flatten jsonb and mvt tile generation using postgis` to have something faster without changing the sql data model and benchmark it for all layers

# Why do we store json in the db

Storing jsonb in the db was initially done for the request `get_infra_with_ext.sql` (~4.4s) and `get_infra_no_ext.sql` (~5.5s). Even these queries are slow atm.
ttsirkia commented 1 year ago

Excluding the infra editing, creating static tiles could be one solution. This tool https://github.com/felt/tippecanoe originating from MapBox works very well for this kind of purposes.

ttsirkia commented 1 year ago

Here is an announcement of the fork: https://felt.com/blog/erica-fischer-tippecanoe-at-felt It can optimize the details for tiles when zooming out etc.

ttsirkia commented 1 year ago

This https://github.com/maptiler/tileserver-gl with its tileserver-gl-light is very nice option for hosting/providing the tiles created with Tippecanoe.

multun commented 1 year ago

Generating tiles ahead of times is indeed the easiest way around. However, as we want to be able to edit infrastructure, we still have to figure out which tiles changed, and re-generate these on the fly.

here's what we currently do:

As would still have to do these things even if all tiles were generated ahead of time, I'd rather try to make these steps as fast as possible. We're probably doing something wrong here!

I believe it's currently quite slow for two reasons:

ttsirkia commented 1 year ago

Editing and browsing infra of course have totally different requirements. And I understand that if both are needed, it is not a good idea to provide two totally different approaches. However, using Tippecanoe also helps to prevent this kind of issues out of the box https://github.com/DGEXSolutions/osrd/issues/3164

bloussou commented 1 year ago

Postponed until it really creates issues or difficulties. As discussed, we know it can be better but we have other priorities at the moment