orbitalci / orbital

Orbital is a self-hosted CI system for solo/small consulting dev teams. Written in Rust.
GNU General Public License v3.0

Build experience needs to allow live-streaming logs and handling async cancellation gracefully #243

Closed tjtelan closed 4 years ago

tjtelan commented 4 years ago

Currently `orb build` is completely synchronous and holds the connection open, with no output, while the backend performs the build.

Ideally, logs would stream back to the user in real time. This could probably be managed by using the Docker API to stream the equivalent of `docker logs` back to the user, but the current endpoint's output definition is not a stream. This will be important if we're going to handle builds that aren't just containerized.

If we want to cancel a build, my preference would be to inject a build-cancellation notice into the build log and do any necessary cleanup of database records. I'm not sure how this event will cause the current build loop to change, but some kind of polling against the database may be required. (Related to #211)
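A minimal sketch of the cancellation idea, using a shared atomic flag as a stand-in for whatever the cancel endpoint would actually set (e.g. a DB record the build loop polls). All names here are hypothetical, not Orbital's actual types:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// `cancel` would be set by the cancel endpoint after updating the build's
// DB record; the build loop polls it between steps.
fn run_build(steps: &[&str], cancel: Arc<AtomicBool>) -> Vec<String> {
    let mut log = Vec::new();
    for step in steps {
        if cancel.load(Ordering::SeqCst) {
            // Inject the cancellation notice into the build log itself.
            log.push("[orbital] build canceled by user".to_string());
            break;
        }
        log.push(format!("running: {}", step));
    }
    log
}

fn main() {
    let cancel = Arc::new(AtomicBool::new(false));
    cancel.store(true, Ordering::SeqCst); // simulate a cancel request
    let log = run_build(&["git clone", "make"], cancel);
    println!("{:?}", log);
}
```

In the real system the flag check would be replaced by the DB poll mentioned above, but the shape of the loop is the same.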


Resolves #211

tjtelan commented 4 years ago

I gave this some thought, and I'm considering an approach using Redis pub/sub. That is, unless I can think of a way using only gRPC and Postgres without pub/sub. Personally, I'm not convinced the complexity of avoiding it would be worth the trouble, since the number of builders will be reasonably finite. The internet seems to hate using the DB as a queue due to considerations such as locking if you have multiple consumers.

Prior to Orbital, Ocelot used NSQ and a pub/sub model, and this mostly worked even though the implementation suffered from some race conditions.

I would have used pub/sub via Postgres (LISTEN/NOTIFY) and saved deploying another component, but Diesel does not support it for Postgres (diesel-rs #399), and I'm not feeling up to the task of doing this with tokio-postgres (for example). I'd rather approach this using a model I'm familiar with. (Using Redis instead of NSQ because the crates have better support.)

I'm open to revisiting this model, but at the moment I think streaming logs to a client will be easier to support for non-Docker builds using this approach.


My initial thinking using pubsub:

`orb build` will call the backend, which will queue a `BuildTarget` in Postgres and publish a message to Redis, then return the metadata back to the user.

The rest of the worker `BuildService` implementation will be split out from the existing endpoint into a service that, on startup, listens to a Redis subscription. It will take work off the queue, perform the builds, and listen on a second subscription for cancel requests.

(I'm going to think more about the failure conditions and the desired behavior, since testing "losing a build" will be a very important edge case in terms of keeping the user informed as well as keeping the DB consistent.)

Logs might need to stream to Redis until a build finishes a stage, so that if a client drops off, it can recollect the logs for a running build. After a build moves to a finished state, the output will get saved to Postgres.
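A stand-in model of the buffering idea above, using in-memory `Vec`s where the real system would use Redis (live logs for a running stage) and Postgres (durable output after the stage finishes). The type and method names are hypothetical:

```rust
// Sketch only: `live` stands in for Redis, `persisted` for Postgres rows.
#[derive(Default)]
struct StageLogs {
    live: Vec<String>,                     // logs for the stage in progress
    persisted: Vec<(String, Vec<String>)>, // (stage name, finished output)
}

impl StageLogs {
    fn append(&mut self, line: &str) {
        self.live.push(line.to_string());
    }

    // A client that dropped off can recollect everything streamed so far.
    fn recollect(&self) -> &[String] {
        &self.live
    }

    // When the stage finishes, move its output to durable storage.
    fn finish_stage(&mut self, stage: &str) {
        let lines = std::mem::take(&mut self.live);
        self.persisted.push((stage.to_string(), lines));
    }
}

fn main() {
    let mut logs = StageLogs::default();
    logs.append("cloning repo");
    logs.append("running tests");
    println!("recollected: {:?}", logs.recollect());
    logs.finish_stage("test");
    println!("persisted stages: {}", logs.persisted.len());
}
```

The point of the split is durability semantics: the live buffer is cheap and replayable for reconnecting clients, while only completed stages hit the database.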

tjtelan commented 4 years ago

I rabbit-holed trying to force my current Docker crate (shiplift) to work in `build_engine`. I was trying to use tokio_compat as a way to bridge to async, but I ran into a runtime error while trying to create a new runtime within `build_service`'s existing runtime.

I don't particularly care about the Docker crate I'm using right now, but I can't get wrapped up in trying to change it at this moment...

Printing Docker output works server-side, so here's what I'm going to try: pass a channel in from `build_service` and write to it in place of where I'm currently printing. This will give me some breathing room while one of the many async Docker crates matures to shiplift's level. :crossed_fingers:
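A sketch of that channel hand-off using `std::sync::mpsc` (the real code would likely use an async channel, and the function names here are illustrative, not Orbital's actual API):

```rust
use std::sync::mpsc;
use std::thread;

// Instead of println!-ing Docker output server-side, the build engine
// writes each line into a Sender handed in by build_service.
fn build_engine(log_tx: mpsc::Sender<String>) {
    for line in ["pulling image", "step 1/2", "step 2/2", "done"] {
        // Where the code currently prints, send into the channel instead.
        log_tx.send(line.to_string()).expect("receiver dropped");
    }
    // Dropping log_tx here closes the channel, ending the client's stream.
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || build_engine(tx));
    // build_service side: forward lines to the client as they arrive.
    for line in rx.iter() {
        println!("streamed: {}", line);
    }
    worker.join().unwrap();
}
```

The receiver loop ends naturally when the engine drops its sender, which gives a clean end-of-stream signal to forward to the gRPC client.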

tjtelan commented 4 years ago

One thing to note: using `--no-follow` causes the backend to crash, because the receiving end of the channel closes and my liberal usage of `unwrap()` turns the resulting send error into a panic. Will need to handle this.
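A minimal sketch of the fix: check the result of `send()` instead of unwrapping, and stop streaming when the receiver has hung up. Names are illustrative:

```rust
use std::sync::mpsc;

// Returns how many lines were actually delivered before the client hung up.
fn stream_logs(lines: &[&str], tx: mpsc::Sender<String>) -> usize {
    let mut sent = 0;
    for line in lines {
        if tx.send(line.to_string()).is_err() {
            // Receiver closed (e.g. --no-follow): stop sending, don't panic.
            break;
        }
        sent += 1;
    }
    sent
}

fn main() {
    let (tx, rx) = mpsc::channel();
    drop(rx); // simulate the client closing the stream immediately
    let delivered = stream_logs(&["line 1", "line 2"], tx);
    println!("delivered {} lines", delivered);
}
```

With `std::sync::mpsc`, `send()` returns `Err(SendError)` once the `Receiver` is dropped, so the build loop can treat a closed client as "stop streaming" rather than a fatal error.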