DependencyTrack / hyades

Incubating project for decoupling responsibilities from Dependency-Track's monolithic API server into separate, scalable services.
https://dependencytrack.github.io/hyades/latest
Apache License 2.0

Revisit configuration management to support modification via UI #929

Open nscuro opened 7 months ago

nscuro commented 7 months ago

Most configuration options for the new services are currently set via application.properties or environment variables.

This is in contrast to DT 4.x, which primarily used the UI. The drawback of UI configuration is that a new instance cannot easily be deployed with the desired settings in place; it requires manual adjustments after deployment.

However, there is still a requirement to have configuration options tweakable via UI. We have to find a way to allow for both models, as apparently relying on only one doesn't cut it.

nscuro commented 7 months ago

Another challenge with dynamic configuration is that we need a mechanism to notify services about changes that affect them. For example, when the API key for Snyk is changed, the vulnerability analyzer needs to be notified.

If we have the services poll the database every time they need the config, we're going to DoS the database. If we add caching, we need proper cache invalidation (hard), or have to live with eventually consistent configs (bad).

Typically, systems rely on ZooKeeper or etcd for use cases like this, but we can't introduce more such heavy dependencies at this point.

It may be worth looking into PostgreSQL's LISTEN / NOTIFY support: https://www.postgresql.org/docs/current/sql-notify.html

nscuro commented 5 months ago

As per further discussion, Postgres' LISTEN / NOTIFY mechanism is not a good option, as it won't work across replicas. We specifically chose to make DB interactions of services read-only to support the use of read-only replicas.

Instead, the following option was proposed:

It is possible that there is a slight delay between instance A and instance B of a service reloading their configuration. When users update configurations in the UI, the respective change will eventually make it to all affected services. This should be fine, but we need to make sure that it is clearly documented so users know what to expect.

nscuro commented 4 months ago

Another complication with the approach above:

If we're assuming that database read replicas are being used, we'll run into race conditions:

We would not have this problem if we distributed the entire configuration through Kafka, but I'm not sure whether that's a route we want to take...

nscuro commented 4 months ago

Given there is no way to achieve a truly consistent view of configurations for all applications in the system, without:

The (hopefully final) proposal is as follows:

This should allow services like the vulnerability-analyzer to continue processing several thousand records per second without hammering the database. In the worst case, the cached configuration will be stale for up to 1 minute, but given the other options we have, that is actually not that bad.

Queries to fetch configuration options are lightweight. The biggest factor will be network latency, which the file-based configuration approach naturally didn't have.
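
To make the trade-off concrete, here is a minimal sketch of a time-bounded cache, assuming the configuration is loaded as a single snapshot and reused for up to one minute. The class name, the loader, and the injectable clock are illustrative assumptions and not taken from the Hyades code base.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Optional;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Caches a snapshot of all config properties and reloads it at most once per TTL. */
final class CachedConfig {
    // Hypothetical loader, e.g. a single read-only query against the config table.
    private final Supplier<Map<String, String>> loader;
    private final long ttlNanos;
    // Injectable clock (nanoseconds) so that expiry can be exercised without sleeping.
    private final LongSupplier clock;
    private Map<String, String> snapshot;
    private long loadedAt;

    CachedConfig(Supplier<Map<String, String>> loader, Duration ttl, LongSupplier clock) {
        this.loader = loader;
        this.ttlNanos = ttl.toNanos();
        this.clock = clock;
    }

    /** Returns the cached value; hits the loader only if the snapshot is missing or expired. */
    synchronized Optional<String> get(String key) {
        long now = clock.getAsLong();
        if (snapshot == null || now - loadedAt >= ttlNanos) {
            snapshot = Map.copyOf(loader.get());
            loadedAt = now;
        }
        return Optional.ofNullable(snapshot.get(key));
    }
}
```

In a service this could be constructed as `new CachedConfig(loader, Duration.ofMinutes(1), System::nanoTime)`. However many records are processed, the loader then runs at most once per minute per instance, which is where the "stale for up to 1 minute" bound comes from.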

nscuro commented 2 months ago

On the Quarkus side of things, I'm thinking we should plug into the existing config framework: https://quarkus.io/guides/config-extending-support#custom-config-source

By the looks of it, this can also be made to support reloads at runtime, and even change notification. Looking into it more.
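
As a rough illustration of what "reloads at runtime" plus "change notification" could look like, here is a dependency-free sketch. It does not implement the real MicroProfile `ConfigSource` interface; it only mirrors three of its method names (`getName`, `getPropertyNames`, `getValue`) so the shape is recognizable. The class name, the loader, and the listener mechanism are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Plain-Java stand-in for a database-backed config source with reload and change listeners. */
final class ReloadableConfigSource {
    // Hypothetical loader, e.g. a read-only query returning all config rows.
    private final Supplier<Map<String, String>> loader;
    private final List<Consumer<Set<String>>> listeners = new CopyOnWriteArrayList<>();
    private volatile Map<String, String> properties = Map.of();

    ReloadableConfigSource(Supplier<Map<String, String>> loader) {
        this.loader = loader;
    }

    public String getName() { return "database"; }
    public Set<String> getPropertyNames() { return properties.keySet(); }
    public String getValue(String name) { return properties.get(name); }

    /** Registers a callback that receives the set of keys changed by a reload. */
    void onChange(Consumer<Set<String>> listener) {
        listeners.add(listener);
    }

    /** Reloads the snapshot and reports which keys were added, removed, or changed. */
    synchronized Set<String> reload() {
        Map<String, String> before = properties;
        Map<String, String> after = Map.copyOf(loader.get());
        Set<String> keys = new HashSet<>(before.keySet());
        keys.addAll(after.keySet());
        Set<String> changed = new HashSet<>();
        for (String key : keys) {
            if (!Objects.equals(before.get(key), after.get(key))) {
                changed.add(key);
            }
        }
        properties = after;
        Set<String> result = Set.copyOf(changed);
        if (!result.isEmpty()) {
            listeners.forEach(listener -> listener.accept(result));
        }
        return result;
    }
}
```

A scheduled task would call `reload()` periodically, and a component such as an analyzer client could register via `onChange` to rebuild itself only when one of its own keys appears in the changed set. In an actual Quarkus integration the source's ordinal would additionally decide whether database values override file or environment values, which is a design decision this sketch leaves open.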