fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Store bootstrap package in S3 #19037

Closed dherder closed 3 days ago

dherder commented 4 months ago

Goal

User story
As an IT admin using a self-hosted Fleet,
I want to add a bootstrap package to Fleet and have Fleet store it for me in S3 instead of the database
so that backing up the database is cheaper.

Context

Changes

Product

Engineering

Implementation

As discussed and agreed on Slack in a convo between Noah and Martin:

ℹ️  Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

QA

Risk assessment

Manual testing steps

Testing notes

Confirmation

  1. [ ] Engineer (@____): Added comment to user story confirming successful completion of QA.
  2. [ ] QA (@____): Added comment to user story confirming successful completion of QA.

noahtalerman commented 4 months ago

> I would rather store it in the source repository where the package originated from (like S3)

@dherder what's the customer problem/pain here?

Is the customer trying to cut the extra step of taking the package from S3 and uploading it to Fleet?

If yes, this is possible using Fleet's best practice GitOps. See the example in our starter file here: https://github.com/fleetdm/fleet-gitops/blob/main/teams/workstations.yml#L23-L24

Or, is the problem that the customer can't increase the database size? If so, about how big is the package? Maybe we can offer a recommended best practice DB size.

dherder commented 4 months ago

@noahtalerman if I had the choice, I would not store large binaries as blobs in a db. The pain relates to egress from the db as well as potential db bloat. The direct result is that backup and restore operations cost more and take longer. It is possible to increase the blob size in the db, but if storing a pointer to the file rather than the binary itself fixes the problem, why not do that? It seems like a better solution long term, and should also be considered for our software management solution.
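
A minimal sketch of the "pointer instead of blob" idea, assuming a content hash serves as the pointer. This is not Fleet's implementation: the `bootstrap_packages` table, the `object_store` dict (a stand-in for a bucket such as S3), and both function names are hypothetical and exist only for illustration.

```python
import hashlib
import sqlite3

# Hypothetical stand-in for an object store such as S3 (key -> bytes).
object_store = {}


def make_db():
    """Create an in-memory database with an illustrative metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bootstrap_packages ("
        "team_id INTEGER PRIMARY KEY, name TEXT, sha256 TEXT, bytes BLOB)"
    )
    return conn


def save_package(conn, team_id, name, content):
    """Put the bytes in the object store; keep only metadata and the hash in the DB."""
    digest = hashlib.sha256(content).hexdigest()
    object_store[digest] = content
    conn.execute(
        "INSERT INTO bootstrap_packages (team_id, name, sha256, bytes) "
        "VALUES (?, ?, ?, NULL)",
        (team_id, name, digest),
    )
    return digest


def load_package(conn, team_id):
    """Return the package bytes, reading from the DB row if present, else the store."""
    row = conn.execute(
        "SELECT sha256, bytes FROM bootstrap_packages WHERE team_id = ?",
        (team_id,),
    ).fetchone()
    if row is None:
        raise KeyError(team_id)
    digest, blob = row
    return blob if blob is not None else object_store[digest]
```

With this layout a database backup contains only the small metadata row, while the package itself lives wherever the object store keeps it.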

noahtalerman commented 4 months ago

Hey @dherder got it. Do you anticipate that this will be a blocker for the customer to use Fleet's MDM features in production? They won't increase the DB size.

> considered for our software management solution

Our plan is to require S3 for software management. Packages will be stored here.

dherder commented 4 months ago

@noahtalerman I do see this as a blocker in this particular case. Great to know that we will use S3 for the software management piece.

noahtalerman commented 3 months ago

UPDATE: Not for a bit. Maybe 2025 (noahtalerman 2024-05-28)

@dherder do you know when customer-faltona is going to migrate to Fleet's MDM features?

dherder commented 2 months ago

@noahtalerman Having the package served from the defined URL would be the preferred mechanism to solve this problem. The real issue is the desire to host the package on an edge CDN such as CloudFront.

dherder commented 2 months ago

@noahtalerman this feature is required for customer-starchik. The pain with hosting in buckets like S3 or in the db is that neither solves the download bottlenecks that a content distribution solution would. Many customers have end users in remote regions, and if a CDN could be specified, the package could be served to a device being bootstrapped at the best download speed.

noahtalerman commented 2 months ago

Hey @Patagonia121, heads up this didn't get designed in the current design sprint. Taking it to the next design sprint because it's a customer commit w/ 2024-09-15 date.

noahtalerman commented 2 months ago

Hey @dherder heads up, I updated this issue to user story format and moved your original issue description below for safekeeping.

If you get the chance, can you please take a look at the user story (in new issue description) and let me know what you think?


Problem

Today we store the bootstrap package in the Fleet database. As an IT admin, I want to optionally not store the package in the db because I would rather store it in the source repository where the package originated from (like S3). This also has the benefit of keeping the Fleet server's database at a manageable size.

noahtalerman commented 1 month ago

Hey @lukeheath any concerns with the approach spec'd in the issue description? ("Changes" section)

Default is database. Use S3 if configured.
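
A minimal sketch of that rule, assuming "configured" means a non-empty bucket name. The `S3Config` class and `choose_backend` function are hypothetical names for illustration and are not Fleet's actual configuration keys or code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class S3Config:
    """Hypothetical S3 settings; an unset or empty bucket means S3 is not configured."""
    bucket: Optional[str] = None
    prefix: str = ""


def choose_backend(s3):
    """Return "s3" when a bucket is configured, otherwise fall back to "database"."""
    if s3 is not None and s3.bucket:
        return "s3"
    return "database"
```

Keeping the database as the fallback means existing deployments behave the same until someone opts in by configuring S3.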

marko-lisica commented 1 month ago

Hey team! Please add your planning poker estimate with Zenhub @jahzielv @dantecatalfamo @gillespi314 @roperzh

lukeheath commented 1 month ago

> Hey @lukeheath any concerns with the approach spec'd in the issue description? ("Changes" section)
>
> Default is database. Use S3 if configured.

@noahtalerman Looks good to me!

noahtalerman commented 1 month ago

Hey @Patagonia121 heads up, I learned that customer-starchik also wants to serve the bootstrap package from CloudFront, in addition to hosting the package in S3.

I tracked a separate "Serve from CloudFront" request here: https://github.com/fleetdm/fleet/issues/20765

cc @dherder

dherder commented 1 month ago

Thanks @noahtalerman, I think I requested CDN support back at the June 20 feature fest.

mna commented 1 month ago

Manual QA:

Tested with the S3 storage configured, added a bootstrap package for a team, validated in the DB that the content was not in the DB (bytes column is NULL):

mysql> select * from mdm_apple_bootstrap_packages \G
*************************** 1. row ***************************
   team_id: 1
      name: dummy-bootstrap-package.pkg
    sha256: 0x061DA2FC0C9E274B08C3EEA313869987ACF1BD3D02B610397B3FBB7E2B11140A
     bytes: NULL
     token: 6f4af6d9-c930-45fd-b530-63c208022032
created_at: 2024-08-12 19:42:27
updated_at: 2024-08-12 19:42:27
1 row in set (0.00 sec)

and verified that it was stored in S3 (minio in local dev):

image

Reset my Mac mini and did the DEP-enroll flow. It did enroll in my local Fleet setup.

image

After enrollment, it successfully received the bootstrap package (the dummy one that we use in tests; it adds a Fleet logo at a well-known path, and I checked that the logo was installed there).

image

Also, after this enrollment I deleted the bootstrap package via the Fleet UI, triggered the cleanup cron job, and verified that it properly deleted the file on S3 (minio) since it was now unused.
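
A minimal sketch of what such a cleanup pass could look like, assuming stored objects are keyed by the package hash and an object counts as unused when no database row references its key. The function names and the dict-based store are hypothetical; the actual cron job is not shown in this thread.

```python
def unused_keys(stored_keys, referenced_hashes):
    """Return the stored keys that no database row references, in sorted order."""
    return sorted(set(stored_keys) - set(referenced_hashes))


def cleanup(store, referenced_hashes):
    """Delete unreferenced objects from the store and return the keys removed."""
    removed = unused_keys(store.keys(), referenced_hashes)
    for key in removed:
        del store[key]
    return removed
```

For example, with two stored objects and only one of them still referenced, `cleanup` deletes the other and leaves the referenced object in place.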

PezHub commented 4 weeks ago

Additional QA notes: I was able to walk through the same scenario as Martin above and can confirm that my freshly enrolled MBAir receives the bootstrap pkg, and that it's stored in the S3 bucket with a reference in the DB with NULL bytes.

Screenshot 2024-08-19 at 11 18 17 AM

Screenshot 2024-08-19 at 11 16 28 AM

I'll have to confirm/test the other scenarios regarding existing bootstrap packages etc. once this gets cut over to dogfood, since I was not able to fully test in my local env.

noahtalerman commented 3 days ago

Hey @Patagonia121, heads up that this customer request was shipped in 4.56 🎉

fleet-release commented 3 days ago

Bootstrap package stored,
In S3, not database,
Efficient, secured.