Open FFY00 opened 3 years ago
The Citus docs ask us to contact them if we need multiple coordinators, which is a little bit weird. http://docs.citusdata.com/en/v10.0/admin_guide/cluster_management.html#adding-a-coordinator
Okay, I think the safest bet for the database is to go with standard SQL, because it's relational, has great tooling already, and has several possible backend options (e.g. PostgreSQL, CockroachDB, SQLite).
Using plain Kafka and pydantic to orchestrate the builds is not optimal, as that would require us to implement a task queue ourselves. I am leaning towards https://github.com/faust-streaming/faust instead.
Faust is not a good fit for this, as we need multiple workers to share the load evenly, so I am going with Celery. Hopefully, Celery will gain support for a distributed broker in the future.
I have been thinking about this for a few months and ended up with this.
The webapp and builders communicate via webhook events. The builders might be a Celery cluster, a custom app that triggers a GitHub Actions workflow, etc. They will store the built wheels on AWS S3 or MinIO.
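To make the "builders are pluggable" idea a bit more concrete, here is a minimal sketch of what the webapp side could look like. Everything in it is hypothetical: the `BuildEvent` fields, the `build.requested` event name, the `Builder` interface and the `LocalBuilder` stand-in are just illustration, not a decided design.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class BuildEvent:
    """Hypothetical webhook event sent from the webapp to a builder."""
    kind: str      # e.g. "build.requested" (made-up event name)
    project: str
    version: str


class Builder(Protocol):
    """Anything that can react to a build event."""

    def handle(self, event: BuildEvent) -> str:
        """Start the build and return where the wheel will be stored."""
        ...


class LocalBuilder:
    """Stand-in backend. A real one might enqueue a Celery task or
    trigger a GitHub Actions workflow instead."""

    def handle(self, event: BuildEvent) -> str:
        # Pretend the wheel ends up under this object-store prefix.
        return f"wheels/{event.project}/{event.version}/"


def dispatch(event: BuildEvent, builder: Builder) -> str:
    """Route a build request to whichever builder backend is configured."""
    if event.kind != "build.requested":
        raise ValueError(f"unsupported event kind: {event.kind}")
    return builder.handle(event)
```

The webapp only depends on `dispatch` and the event shape, so swapping the backend should not touch the rest of the code.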
I am not sure if we should split the uploading into a separate component. The idea would be that the webapp would trigger the builders, and these would report the outcome to the uploader component. As the uploader needs the secret keys, we could give the webapp write-only permission to the secrets database and only allow the uploader to read from it, isolating the secrets.
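A rough sketch of the write-only / read-only split, assuming an in-memory store just for illustration. The class names are hypothetical, and in practice the enforcement would have to come from the database permissions themselves (for example, granting the webapp role `INSERT` but not `SELECT`), not from Python objects.

```python
class SecretStore:
    """Toy in-memory stand-in for the secrets database."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}


class SecretWriter:
    """Handle given to the webapp: it can store secrets but not read them."""

    def __init__(self, store: SecretStore) -> None:
        self._store = store

    def put(self, project: str, token: str) -> None:
        self._store._data[project] = token


class SecretReader:
    """Handle given to the uploader: it can read secrets but not write them."""

    def __init__(self, store: SecretStore) -> None:
        self._store = store

    def get(self, project: str) -> str:
        return self._store._data[project]


store = SecretStore()
webapp_secrets = SecretWriter(store)
uploader_secrets = SecretReader(store)

webapp_secrets.put("example", "upload-token")
```

With this shape, a compromised webapp process would only hold a `SecretWriter`, so it has no API to read the upload tokens back.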
Updated diagram.
The architecture should be fairly straightforward: a webapp with a set of build nodes. The main consideration when choosing technologies is scalability, so we should choose components that have a distributed architecture. Another consideration is licensing: we should go with FOSS offerings.
For the webapp framework, I think I want to go with Starlette.
To orchestrate the builds, I think the best option is probably Kafka. We can use pydantic to serialize Python objects to JSON, providing a nice and natural interface.
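To give an idea of the message round trip, here is a small sketch. Note that it swaps in stdlib `dataclasses` and `json` so it runs without extra dependencies; a pydantic model would take the place of the dataclass and add validation. The `BuildRequest` fields are hypothetical, no message schema has been decided.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BuildRequest:
    # Hypothetical fields, only for illustration.
    project: str
    version: str
    python_tag: str


def encode(request: BuildRequest) -> bytes:
    """Serialize a request to UTF-8 JSON, usable as a message value."""
    return json.dumps(asdict(request), sort_keys=True).encode("utf-8")


def decode(payload: bytes) -> BuildRequest:
    """Rebuild the request; missing or unexpected keys raise TypeError."""
    return BuildRequest(**json.loads(payload.decode("utf-8")))


request = BuildRequest(project="example", version="1.0.0", python_tag="cp311")
assert decode(encode(request)) == request
```

The producer and consumer then only ever deal with `BuildRequest` objects, and the JSON stays an implementation detail of the transport.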
For the database, we have a few options:
Going with an SQL option makes things slightly easier for testing, and might also make things slightly easier for people rolling their own deployment that does not need to be scaled up.
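As an example of the testing benefit, a test suite could run against an in-memory SQLite database instead of a real server. This is only a sketch: the `builds` table and the helper names are made up. Also note that the `?` placeholders are what the `sqlite3` driver uses, other drivers use a different parameter style.

```python
import sqlite3

# Hypothetical schema, kept to plain SQL so it stays portable.
SCHEMA = """
CREATE TABLE builds (
    id INTEGER PRIMARY KEY,
    project TEXT NOT NULL,
    version TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
)
"""


def make_test_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with the schema loaded."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def record_build(conn: sqlite3.Connection, project: str, version: str) -> int:
    """Insert a new build row and return its id."""
    cur = conn.execute(
        "INSERT INTO builds (project, version) VALUES (?, ?)",
        (project, version),
    )
    conn.commit()
    return cur.lastrowid


def build_status(conn: sqlite3.Connection, build_id: int):
    """Return the status of a build, or None if it does not exist."""
    row = conn.execute(
        "SELECT status FROM builds WHERE id = ?", (build_id,)
    ).fetchone()
    return row[0] if row else None
```

Each test can call `make_test_db()` to get a fresh database, so there is no shared state to clean up between tests.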