fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Blueprint discussion: query versioning / sharding support #249

Open nyanshak opened 3 years ago

nyanshak commented 3 years ago

Originally: https://github.com/kolide/fleet/issues/2163, re-raising here, but you can see the full discussion there.

I've been trying to figure out how to roll out changes to queries gradually. I was pointed to sharding, which looks like an excellent solution for new queries, but I feel Fleet could provide a better experience for modifying existing queries.

New queries: Create new query with, say, 5% rollout, then gradually increase to 100% rollout over time.

Existing queries (current situation): Create a new query with a different name (since queries can't have the same name), with rollout 5%. Gradually increase to 100%, then delete the old query. This also means the alerting config (on logs or whatever) has to be modified to look at both names, and there's likely to be some overlap where you log duplicate data (once for the old query, once for the new query) while the changes are rolled out.

Existing queries (proposal): Add version support for queries. As a user, you might start with v0 of the query, rolled out similar to "New queries" above. When you want to modify the query, you create a v1 of the query, with 5% rollout, and decrease rollout of v0 to 95%. Gradually you shift the balance until v1 has 100% rollout and v0 has 0% rollout.
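
To make the proposal concrete, here is a minimal sketch of how a host could be mapped to exactly one version of a query. It is not how osquery or Fleet implement sharding; the hashing scheme and the names `host_bucket` and `pick_version` are assumptions made up for illustration. Each host gets a stable bucket in [0, 100) per logical query name, and the newest version claims the lowest buckets, so a host that has moved to v1 stays on v1 as its rollout grows.

```python
import hashlib


def host_bucket(host_id, query_name):
    """Return a stable bucket in [0, 100) for this host and logical query name.

    Hypothetical scheme: hash "query_name:host_id" and reduce modulo 100.
    """
    digest = hashlib.sha256(f"{query_name}:{host_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def pick_version(host_id, query_name, versions):
    """Pick the single version a host should run.

    `versions` is a list of (version_name, rollout_percent) pairs ordered
    oldest first. The percentages must sum to 100.
    """
    if sum(percent for _, percent in versions) != 100:
        raise ValueError("rollout percentages must sum to 100")

    bucket = host_bucket(host_id, query_name)
    upper = 0
    # Walk newest to oldest so the newest version owns buckets [0, rollout).
    for name, percent in reversed(versions):
        upper += percent
        if bucket < upper:
            return name
    raise AssertionError("unreachable when percentages sum to 100")


if __name__ == "__main__":
    rollout = [("my_query_v0", 95), ("my_query_v1", 5)]
    for host in ("host-a", "host-b", "host-c"):
        print(host, pick_version(host, "my_awesome_query", rollout))
```

Because every host resolves to one version, no host runs both v0 and v1, and the logged `name` can stay the same throughout the rollout.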

Additional impacts: fleetctl would somehow need to support this concept. Here's one example (pretty rough, and I'd definitely want to polish it with the community):

---
apiVersion: v1
kind: pack
spec:
  name: example_query
  description: Example Query
  queries:
    - name: my_awesome_query # this will be the `name` in logs
      interval: 7200
      versions:
        - name: my_query_v0  # this query defined in query spec, called my_query_v0
          version: 0
          rollout: 95
        - name: my_query_v1
          version: 1  # versions could possibly be used to tag the query (e.g., by appending to name)
          rollout: 5
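
If a spec like the one above were adopted, a tool would probably want to sanity-check the `versions` list before applying it. The following is a rough sketch under that assumption: it works on a plain dict shaped like one entry of `queries` in the example, and `validate_versions` is a hypothetical helper, not an existing fleetctl feature.

```python
def validate_versions(query):
    """Return a list of problems found in a query entry's `versions` list.

    `query` is assumed to look like one item under `queries` in the example
    spec: a dict with `name` and a list of `versions`, each carrying
    `name`, `version`, and `rollout`.
    """
    label = query.get("name", "<unnamed>")
    versions = query.get("versions") or []
    if not versions:
        return [f"{label}: no versions defined"]

    errors = []
    for v in versions:
        rollout = v.get("rollout")
        if not isinstance(rollout, int) or not 0 <= rollout <= 100:
            errors.append(f"{label}: rollout for {v.get('name')} must be an integer from 0 to 100")

    numbers = [v.get("version") for v in versions]
    if len(set(numbers)) != len(numbers):
        errors.append(f"{label}: version numbers must be unique")

    names = [v.get("name") for v in versions]
    if len(set(names)) != len(names):
        errors.append(f"{label}: version names must be unique")

    # Only check the total when every individual rollout was valid.
    if not errors and sum(v["rollout"] for v in versions) != 100:
        errors.append(f"{label}: rollouts must sum to 100")
    return errors


if __name__ == "__main__":
    example = {
        "name": "my_awesome_query",
        "interval": 7200,
        "versions": [
            {"name": "my_query_v0", "version": 0, "rollout": 95},
            {"name": "my_query_v1", "version": 1, "rollout": 5},
        ],
    }
    print(validate_versions(example))
```

Requiring the rollouts to sum to 100 is a design choice in this sketch: it guarantees every targeted host runs exactly one version, which matches the "shift the balance" description in the proposal.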

This is still a fairly rough idea, but I wanted to be able to continue the discussion in the fork.

noahtalerman commented 3 years ago

@nyanshak Thank you for pulling this discussion into the fleetdm repo.

I'm going to restate the request to check my own understanding. Please let me know if I have it right:

Goal

As a Fleet user, I want to be able to add a new version to an existing query, and I don't want to manage a complex decorator-based query config to do so.

Problems this solves

With this, I can roll out the query changes (new version) to only a percentage of the already targeted hosts (testing concern) and ensure that the query isn't run multiple times on the same host (performance concern).