Open bhgrant8 opened 6 years ago
I've been thinking one Docker network with one GeoDjango container and one PostGIS container per dataset. Our datasets are so small that I'd be surprised we'd need anything like a sharded PostgreSQL server, for example. If we get to the point where we need to worry about scaling we'll probably end up having to port to a Platform-as-a-Service like OpenShift anyhow.
When I talk about scaling, I am not so much talking about data size. We are quite far from needing to worry about sharding or any managed/platform-as-a-service model.
The question is more about organization and usability: if we add endpoints to the API in the future, I don't want us to have to work around a bad or hasty decision made now.
I see we could either do:
1) One GeoDjango container hosting a single-app project per data set > one PostGIS container per data set
2) A single GeoDjango container for the project > individual apps per data set > individual PostGIS containers for each data set (https://docs.djangoproject.com/en/2.0/topics/db/multi-db/)
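For the second option, the linked multi-db docs describe database routers. A rough sketch of what such a router might look like is below. The app labels (`crash`, `ridership`, `congestion`) and database aliases (`crash_db`, etc.) are hypothetical placeholders, not names from our repos, and this is only one way to set it up.

```python
# Sketch of a Django database router for the "one project, one app per
# data set" layout. App labels and database aliases are hypothetical.

APP_TO_DB = {
    "crash": "crash_db",
    "ridership": "ridership_db",
    "congestion": "congestion_db",
}


class DatasetRouter:
    """Send each data set app's models to that data set's own database."""

    def db_for_read(self, model, **hints):
        # None means "no opinion": Django falls through to the next router
        # or to the "default" database.
        return APP_TO_DB.get(model._meta.app_label)

    def db_for_write(self, model, **hints):
        return APP_TO_DB.get(model._meta.app_label)

    def allow_relation(self, obj1, obj2, **hints):
        db1 = APP_TO_DB.get(obj1._meta.app_label)
        db2 = APP_TO_DB.get(obj2._meta.app_label)
        if db1 and db2:
            # Only allow relations between models stored in the same database.
            return db1 == db2
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label in APP_TO_DB:
            # A data set app only migrates into its own database.
            return db == APP_TO_DB[app_label]
        if db in APP_TO_DB.values():
            # Keep other apps (auth, admin, ...) out of the data set databases.
            return False
        return None
```

The router would be enabled by listing its dotted path in the `DATABASE_ROUTERS` setting, with one `DATABASES` entry per alias. One consequence worth noting: with separate databases, a model in one app cannot have a foreign key to a model in another.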
for option 2):
If we do the single individual containers for each dataset (option 1),
You can serve an arbitrary number of unrelated databases from a single PostGIS container. Can you serve an arbitrary number of REST APIs from a single GeoDjango container?
It seems like we're thinking of a separate API for each main dataset (Ridership, Crash and Congestion), each with a database consisting of the core dataset and any auxiliary data we might want to JOIN with it (Census, for example). Keeping each one as two containers helps with capacity planning; we can measure their resource usage individually.
I guess I should start a DevOps discussion on capacity planning covering the whole platform.
Given their small size, I don't see any problem with combining the odot_crash_data and passenger_census databases into a single PostGIS container if that makes things simpler for anyone / everyone. The congestion dataset is another story.
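To illustrate the combined layout, here is a minimal sketch of Django settings where both small databases sit in one PostGIS container and congestion gets its own. The host names (`postgis`, `postgis-congestion`), the environment variable names, the `congestion` database name, and the choice of `odot_crash_data` as the `default` alias are all assumptions for illustration.

```python
import os

# Hypothetical host name of the shared PostGIS container on the Docker network.
POSTGIS_HOST = os.environ.get("POSTGIS_HOST", "postgis")


def postgis_db(name, host=POSTGIS_HOST):
    """Build one Django DATABASES entry for a database on a PostGIS server."""
    return {
        "ENGINE": "django.contrib.gis.db.backends.postgis",
        "NAME": name,
        "HOST": host,
        "PORT": os.environ.get("POSTGIS_PORT", "5432"),
        "USER": os.environ.get("POSTGIS_USER", "postgres"),
        "PASSWORD": os.environ.get("POSTGIS_PASSWORD", ""),
    }


DATABASES = {
    # Both small data sets live in the same PostGIS container.
    "default": postgis_db("odot_crash_data"),
    "passenger_census": postgis_db("passenger_census"),
    # Congestion is assumed to run in a separate container.
    "congestion": postgis_db("congestion", host="postgis-congestion"),
}
```

Moving a database to its own container later would then only mean changing its `host` argument.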
Summary:
The Transportation Systems project will contain multiple data sets, which will be exposed through our API. From both a user perspective and a technical perspective, what type of organization makes sense for the project?
Considerations:
Some possible options:
ex:
Benefits:
Detractors:
ex:
Benefits:
Detractors:
ex:
Benefits:
Detractors: