jamespfennell / transiter

Web service for transit data
https://demo.transiter.dev
MIT License
55 stars 6 forks source link

Use the PostGIS Postgres extension? #118

Closed jamespfennell closed 3 months ago

jamespfennell commented 11 months ago

PostGIS is a well established Postgres extension which adds geographic types (like points and lines) to the database. The key thing about PostGIS is that you can very efficiently and easily do geographic SQL queries, like "list all stations within X meters".

Should Transiter use PostGIS? There are already two tables/resources (stops and vehicles) that have location data in them. We currently query these using handcrafted SQL queries. Using PostGIS would make the queries more efficient (our current DB indexes don't and can't optimally index geographic data) and a bit easier (PostGIS would calculate distances rather than us having to maintain the formulas). Potentially after https://github.com/jamespfennell/gtfs/pull/9, @cedarbaum will be adding support for GTFS shapes, which again contain geographic data.

There is a single negative that I can think of, which is that users of Transiter would need to have the extension installed. I think this is not too much of a big deal because (1) there is a Docker PostGIS image so Docker users would just use that and (2) other managed databases like CockroachDB typically have PostGIS installed by default. But it's still a negative.

Wondering if @cedarbaum has thoughts as the implementor of all of the geographic related features in Transiter!

cedarbaum commented 11 months ago

This sounds awesome to me! Aside from efficiency, being able to query for entities by arbitrary shapes (e.g., a borough in NYC) would be really cool!

Also, for when shapes are added, storing them using this extension seems a lot cleaner to me. My original idea was to use a JSON column for the shape data, which I think would be OK initially but will probably require more glue code. And if we wanted to use a standard like GeoJSON at the API layer, it looks like PostGIS supports this as well: https://postgis.net/docs/ST_AsGeoJSON.html

jamespfennell commented 11 months ago

I didn't think of other use cases like being able to query in areas, but that does seem cool!

For the shape data: yeah it did seem we could use a "line string" type directly in a shapes table, rather than creating another table shape_points which is pretty nice.

GeoJSON also does look interesting, though it may not play super nice with the gRPC/protobuf API. We could consider it at the right point though.

Cool, overall seems like this would be worth doing.

jamespfennell commented 11 months ago

(doing some prototyping on https://github.com/jamespfennell/transiter/tree/postgis)