Add persistent storage to `fleetd`

roperzh commented 1 month ago

Goal

User story
As a Fleet user,
I want `fleetd` to be able to persist important data between restarts
so that I can add and use features that otherwise would be impossible to implement.

Context

Requestor(s): @roperzh

fleetd can't keep state between restarts. This has been a problem for features like:

MDM migration
fleetd logs (current implementation is not useful if fleetd is in a crashloop)
MDM disk encryption

For other features we implemented ad-hoc solutions like storing things in temporary files.

This causes friction when planning new features.

Changes

Engineering

[ ] Consider the pros/cons of adding persistent storage
[ ] Investigate database solutions (eg: bbolt, sqlite, etc) and present to the team at large (#g-mdm, #g-endpoint-ops) so we can all agree on something
[ ] Add the solution to Orbit
[ ] Consider if Fleet Desktop should have access

ℹ️ Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

QA

Risk assessment

Risk level: Low

Manual testing steps

Step 1
Step 2
Step 3

Testing notes

Confirmation

[ ] Engineer (@____): Added comment to user story confirming successful completion of QA.
[ ] QA (@____): Added comment to user story confirming successful completion of QA.

mostlikelee commented 1 month ago

If/when the Fleet server is offline, config receivers will fail (or may run incorrectly).

lucasmrod commented 1 month ago

Thanks for creating this issue for a long standing fleetd issue :)

> MDM migration
> fleetd logs (current implementation is not useful if fleetd is in a crashloop)
> MDM disk encryption

Could you elaborate on the kind of data that needs to be stored? Is it sensitive (in other words, should it be accessible only to root)? Is a key-value store enough?

> Investigate database solutions (eg: bbolt) and present to the team at large (#g-mdm, #g-endpoint-ops) so we can all agree on something

My 2 cents: if possible, it should be something implemented in pure Go. (IMO sqlite3 is the best, but it adds complexity to building orbit because of cgo.) bbolt and badgerdb are the popular ones.

> Consider if Fleet Desktop should have access

What kind of data do we need to store for Fleet Desktop? (Maybe we can start with orbit first, root only.)

nonpunctual commented 1 month ago

@roperzh @edwardsb is there some reason the persistent store couldn't be an encrypted SQLite db? Seems like there would be advantages:

small
secure
query with osquery table or extension or ATC
encrypt with existing PKI delivered on enroll

roperzh commented 1 month ago

@nonpunctual no reason! The issue is framed generally so that whoever picks it up does the research and convenes with the team at large.

lukeheath commented 1 month ago

I like the boring solution for this (SQLite).

lucasmrod commented 1 month ago

Sorry to push back. I've used sqlite3 in Go client apps on previous projects, and the CGO requirement (sqlite3 is written in C) was a pain for building. If we use sqlite3 we lose cross-compiling capabilities for orbit. E.g.:

We'll need self-hosted builders to build fleetd for linux arm64 (currently in WIP).
During development we won't be able to build orbit for Linux or Windows from macOS. We'll need a Windows VM to build orbit for Windows (and all the libraries and repository in the VM).

I'm not 100% against it, just sharing past experiences with this stack.

edwardsb commented 1 month ago

> Sorry to push back. I've used sqlite3 in Go client apps on previous projects, and the CGO requirement (sqlite3 is written in C) was a pain for building.
>
> If we use sqlite3 we lose cross-compiling capabilities for orbit. E.g.:
>
> We'll need self-hosted builders to build fleetd for linux arm64 (currently in WIP).
>
> During development we won't be able to build orbit for Linux or Windows from macOS. We'll need a Windows VM to build orbit for Windows (and all the libraries and repository in the VM).
>
> I'm not 100% against it, just sharing past experiences with this stack.

There are pure Go implementations of the SQLite drivers that eliminate the need for CGO and enable cross-compilation. A few alternatives that come to mind:

https://pkg.go.dev/modernc.org/sqlite

https://github.com/glebarez/go-sqlite

https://github.com/ncruces/go-sqlite3

I'm not opposed to a pure Go key-value persistence mechanism, but I was thinking about how we could align more closely with the SQLite nature of osquery.

Managing migrations and schema changes across potentially millions of hosts doesn't sound like much fun, so even if we did reach for something like SQLite, the schema would likely end up as something like:

CREATE TABLE fleetd_state_store( key TEXT PRIMARY KEY, value JSON );

One advantage I can think of is a familiar format/language: rather than having to learn another DSL for Bolt (not that it's particularly complex), we can keep using a familiar interface (SQL).
