fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Create a new maintenance window in 30 seconds or less #19352

Closed rachaelshaw closed 1 week ago

rachaelshaw commented 3 months ago

Goal

User story
As an end user who deletes or moves my maintenance window to a time in the past,
I want to see a new, future maintenance window in 30 seconds or less
so that I know downtime is still going to happen.

Context

Changes

Product

Engineering

ℹ️  Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

QA

Risk assessment

Manual testing steps

  1. Set up load test environment for calendar testing (with FLEET_GOOGLE_CALENDAR_PLUS_ADDRESSING).
    • Because the same calendar holds multiple events, each event change generates a callback for every event on that calendar. So, for 100 events that are all changed at the same time, the Fleet server can see up to 100 × 100 = 10,000 callbacks. This is more than we expect in real life. For load testing this feature, 100 events on each calendar should be sufficient, but we can try pushing it up to 1000 if things are behaving OK.
  2. Make some unrelated simultaneous change to the calendars (like create/delete a random event). Monitor DB and Redis for any spikes. All events should remain the same.
  3. Move all calendar events to the past (use move-events.go) -- make sure they are recreated.
  4. Delete events (use delete-events.go) -- make sure they are recreated.
  5. Redo the move/delete steps and, while they are running, also trigger the calendar cron job.
    • To make cron also refetch all calendar events, set the event update times to more than 30 minutes but less than 1 day in the past, for example with this MySQL command: update calendar_events set updated_at = '2024-07-22 12:21:31';
  6. Move all calendar events to current time -- make sure webhooks fire in the next ~5 minutes. Use Dave's Tines instance to receive webhooks since it has more bandwidth.

Testing notes

We now have a new tools/calendar/move-events/move-events.go script that can be used to check calendar events for users, including catching duplicates.

When creating a bunch of events on the same calendar, you may see these warnings on the server:

msg="Received calendar callback, but did not find corresponding event in database" event_uuid=6782ffb0-4d4b-4110-b458-3f8962c53d85 channel_id=0ac2ef07-272d-47e4-a8df-ae3ec63cb166

This occurs because the callbacks arrive before the event has been saved in our DB. It is harmless under load and should not happen when only 1 event is being created on 1 calendar.
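
To illustrate why the warning is harmless, here is a minimal sketch of a callback handler that logs and drops a callback for an event it cannot find. This is not Fleet's actual handler: `eventStore`, `errEventNotFound`, and `handleCallback` are hypothetical names, and the in-memory map stands in for the database.

```go
package main

import (
	"errors"
	"fmt"
)

// errEventNotFound is a hypothetical sentinel error for a missing event row.
var errEventNotFound = errors.New("calendar event not found")

// eventStore is a hypothetical in-memory stand-in for the events table,
// keyed by event UUID.
type eventStore map[string]bool

// lookup returns errEventNotFound when the event has not been saved yet.
func (s eventStore) lookup(eventUUID string) error {
	if !s[eventUUID] {
		return errEventNotFound
	}
	return nil
}

// handleCallback reports whether the callback was processed. A callback for
// an unknown event is logged as a warning and dropped without an error,
// because the event may simply not have been saved yet.
func handleCallback(s eventStore, eventUUID, channelID string) (bool, error) {
	err := s.lookup(eventUUID)
	if errors.Is(err, errEventNotFound) {
		fmt.Printf("msg=%q event_uuid=%s channel_id=%s\n",
			"Received calendar callback, but did not find corresponding event in database",
			eventUUID, channelID)
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	store := eventStore{"saved-event": true}
	// The first callback is processed; the second is logged and dropped.
	fmt.Println(handleCallback(store, "saved-event", "channel-1"))
	fmt.Println(handleCallback(store, "unsaved-event", "channel-1"))
}
```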

Confirmation

  1. [ ] Engineer (@____): Added comment to user story confirming successful completion of QA.
  2. [x] QA (@____): Added comment to user story confirming successful completion of QA.

rachaelshaw commented 3 months ago

@noahtalerman is this a duplicate of https://github.com/fleetdm/fleet/issues/19491?

noahtalerman commented 3 months ago

Not a duplicate for now. #19491 is a hidden config. This story is about subscribing to calendar. Then we can remove the config.

noahtalerman commented 4 weeks ago

Documentation

The Fleet server watches for potential changes for up to 1 week after the original event time. If the event is moved forward by more than 1 week, then after that first week the Fleet server checks for event changes once every 30 minutes.

These near real-time updates may add load to the Google Calendar API, so we recommend setting up API usage alerts or other monitoring. If the Google API is overloaded, calendar updates and/or webhooks may be delayed.

Hey @getvictor do you know if this is/will be documented in a guide?

noahtalerman commented 2 weeks ago

Documentation

The Fleet server watches for potential changes for up to 1 week after the original event time. If the event is moved forward by more than 1 week, then after that first week the Fleet server checks for event changes once every 30 minutes.

These near real-time updates may add load to the Google Calendar API, so we recommend setting up API usage alerts or other monitoring. If the Google API is overloaded, calendar updates and/or webhooks may be delayed.

Hey @getvictor just giving you another ping :)

Is this already documented in a guide? If not can you please help document it in one?

getvictor commented 2 weeks ago

@noahtalerman This is documented in this PR https://github.com/fleetdm/fleet/pull/20974/files

noahtalerman commented 1 week ago

Closing this issue even though the article hasn't shipped yet. The product team is tracking the article in a separate story: #20763

fleet-release commented 1 week ago

Quick as the falcon, New window forms in the cloud, Uptime ensured, proud.