systeminit / si

The System Initiative software
https://systeminit.com
Apache License 2.0

feat: rearchitect rebaser to support more scale #4638

Closed fnichol closed 1 month ago

fnichol commented 1 month ago

A rather large change to how a Rebaser server consumes requests for change sets and when it performs the related "dependent values update" (aka "DVU") runs.

General NATS Jetstream Architecture

A Rebaser server uses 2 NATS Jetstream streams to track its work:

Tasks Stream

On the tasks stream, a single Jetstream consumer is set up to share work across multiple Rebasers as clients. When a Rebaser starts to process a task message, it continuously sends AckKind::Progress messages to keep the message in flight so that it is neither ack'd nor redelivered. If the task encounters an error, it can return a Result::Err(_) from its Naxum handler, which will trigger an AckKind::Nack(None) message, causing an immediate redelivery of the message to another Rebaser.

If a task needs to be interrupted due to a graceful shutdown of a Rebaser server, the handler will return an Error::Interupted(_) error, which ensures that the message is nack'd and will be redelivered. If a task can be cleanly completed, the handler will return a Result::Ok(_), which triggers an AckKind::Ack, causing the message to be deleted from the tasks stream so that it will not be run again.
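The mapping from handler outcome to acknowledgement described above can be sketched roughly as follows. This is only an illustration: the `AckKind` enum here is a local stand-in named after the identifiers in this description (not the NATS client's own type), and `TaskError` and `ack_for` are hypothetical names that do not come from the Rebaser code.

```rust
use std::time::Duration;

/// Local stand-in for the acknowledgement kinds named in the description.
/// This is NOT the NATS client's type, only a mirror for illustration.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum AckKind {
    /// Task finished cleanly: the message is removed from the tasks stream.
    Ack,
    /// Task failed or was interrupted: redeliver (None = no extra delay).
    Nack(Option<Duration>),
    /// Task is still running: keep the message in flight.
    Progress,
}

/// Hypothetical handler error type for this sketch.
#[allow(dead_code)]
#[derive(Debug)]
enum TaskError {
    /// Graceful shutdown interrupted the task.
    Interrupted(String),
    /// Any other failure while running the task.
    Failed(String),
}

/// Decide which acknowledgement a finished handler result maps to.
fn ack_for(result: &Result<(), TaskError>) -> AckKind {
    match result {
        // Clean completion: ack, so the task is not run again.
        Ok(()) => AckKind::Ack,
        // Both interruption and failure hand the task to another Rebaser.
        Err(TaskError::Interrupted(_)) | Err(TaskError::Failed(_)) => AckKind::Nack(None),
    }
}
```

In this sketch an interrupted task and a failed task are treated identically from the stream's point of view: both leave the task message in place for another Rebaser to pick up.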

Requests Stream

When a "process" task is running, a dedicated NATS consumer is created to exclusively process all requests for a change set in serial (that is, one at a time). This consumer is known as an "ordered consumer" and is push-based (rather than the default pull-based consumers). An ordered consumer is much lighter weight and ephemeral as far as a NATS cluster is concerned and thus should reduce the stress on NATS when many change sets are created/active over a short period of time.

This ordered consumer is set up with a timeout that detects when no message has entered the subject (or no message has been pulled into the Naxum app) within a period. When this "quiescence" period is detected, it triggers a specific graceful shutdown of the "process" task, whose exit state is to return Result::Ok(_) and ack the task message. In this way, change sets which become inactive (that is, have no Rebaser request messages) are spun down to conserve resources and allow the Rebaser to focus only on "active" change sets.
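As a rough illustration of the quiescence idea, the sketch below uses a standard library channel in place of the NATS ordered consumer and treats a receive timeout as the "no activity" signal. The names `run_until_quiescent` and `ExitReason` are hypothetical, and the real implementation is asynchronous and built on Naxum rather than on `std::sync::mpsc`.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Why the (simulated) "process" task stopped.
#[derive(Debug, PartialEq)]
enum ExitReason {
    /// No request arrived within the quiet period: shut down cleanly.
    Quiescent,
    /// The request source went away entirely.
    SourceClosed,
}

/// Process requests one at a time until no request arrives for `quiet_period`.
/// Returns the requests handled and the reason for stopping.
fn run_until_quiescent(
    requests: Receiver<String>,
    quiet_period: Duration,
) -> (Vec<String>, ExitReason) {
    let mut processed = Vec::new();
    loop {
        match requests.recv_timeout(quiet_period) {
            // A request arrived in time: handle it and wait again.
            Ok(request) => processed.push(request),
            // Nothing arrived within the quiet period: the change set is inactive.
            Err(RecvTimeoutError::Timeout) => return (processed, ExitReason::Quiescent),
            // All senders were dropped.
            Err(RecvTimeoutError::Disconnected) => return (processed, ExitReason::SourceClosed),
        }
    }
}
```

A caller that sees `ExitReason::Quiescent` would, in the design described above, return an `Ok` result so that the task message is ack'd and the change set is spun down.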

Using an ordered consumer means that we can no longer use a message ack to delete a processed message (this is a trick of a work queue stream). Therefore, when a request has been successfully processed, the message is deleted from the stream using its sequence number. A new Naxum middleware called PostProcess provides us a way to handle this delete in an OnSuccess callback. In the OnFailure case we simply don't delete the message, which means the next message is still the first and only message to process.
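The delete-on-success behaviour can be modelled with a small in-memory stand-in for the requests stream. This is a sketch under assumptions: `RequestStream`, `process_next`, and `head_sequence` are hypothetical names, and the real code deletes the message from a Jetstream stream through the PostProcess middleware rather than from a map.

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for a change set's requests stream,
/// keyed by stream sequence number.
struct RequestStream {
    messages: BTreeMap<u64, String>,
}

impl RequestStream {
    /// Build a stream whose sequence numbers start at 1.
    fn new(payloads: Vec<&str>) -> Self {
        let messages = payloads
            .into_iter()
            .enumerate()
            .map(|(i, p)| (i as u64 + 1, p.to_string()))
            .collect();
        Self { messages }
    }

    /// Sequence number of the oldest message still in the stream.
    fn head_sequence(&self) -> Option<u64> {
        self.messages.keys().next().copied()
    }

    /// Run `handler` on the oldest message. On success the message is
    /// deleted by sequence number; on failure it is left in place, so it
    /// remains the next message to process.
    fn process_next<F>(&mut self, handler: F) -> Option<Result<u64, String>>
    where
        F: FnOnce(&str) -> Result<(), String>,
    {
        let (seq, payload) = self
            .messages
            .iter()
            .next()
            .map(|(seq, payload)| (*seq, payload.clone()))?;
        match handler(&payload) {
            Ok(()) => {
                // "OnSuccess": delete the processed message by its sequence.
                self.messages.remove(&seq);
                Some(Ok(seq))
            }
            // "OnFailure": do nothing, the message stays at the head.
            Err(err) => Some(Err(err)),
        }
    }
}
```

Calling `process_next` with a failing handler leaves `head_sequence()` unchanged, while a succeeding handler advances it to the following request.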

Another tracked Tokio task, called the SerialDvuTask, runs alongside this request-consuming task. It waits for a tokio::sync::Notify to fire, which is triggered by the request-consuming Tokio task. If, during the run of a DVU, another request is processed that requires another DVU run, then the Notify will be set again. That way, when the SerialDvuTask loop comes back to check, the Notify will be set, thus triggering "yet one more" DVU run.
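The coalescing effect of this arrangement can be shown with a single-threaded simulation. This is a deliberately simplified sketch: it replaces tokio::sync::Notify with an atomic flag that holds at most one pending signal, it does not block while idle as the real task would, and `DvuSignal` and `run_serial_dvus` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Stand-in for the notification: holds at most one pending signal,
/// so repeated notifications before a check collapse into one.
struct DvuSignal {
    pending: AtomicBool,
}

impl DvuSignal {
    fn new() -> Self {
        Self { pending: AtomicBool::new(false) }
    }

    /// Called by the request-consuming side when a request needs a DVU.
    fn notify(&self) {
        self.pending.store(true, Ordering::SeqCst);
    }

    /// Called by the DVU loop: consume the pending signal, if any.
    fn take(&self) -> bool {
        self.pending.swap(false, Ordering::SeqCst)
    }
}

/// Simulate the serial DVU loop. `arrivals_during_run[i]` is how many
/// DVU-requiring requests are processed while the i-th DVU run is in
/// progress. Returns the total number of DVU runs performed.
fn run_serial_dvus(signal: &DvuSignal, arrivals_during_run: &[usize]) -> usize {
    let mut runs = 0;
    // Keep running DVUs for as long as a signal is pending.
    while signal.take() {
        let arrivals = arrivals_during_run.get(runs).copied().unwrap_or(0);
        // Requests handled during this run each re-arm the signal.
        for _ in 0..arrivals {
            signal.notify();
        }
        runs += 1;
    }
    runs
}
```

In this model, three requests arriving during one DVU run produce exactly one follow-up run rather than three, which is the "yet one more" behaviour described above.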

Other Structural Changes

Among other changes, some of note are:

fnichol commented 1 month ago

/try
