fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

`nano_enrollments.last_seen_time` might cause deadlocks on 30k-host deployments #21340

Open roperzh opened 2 months ago

roperzh commented 2 months ago

Fleet version: 4.55.0


💥  Actual behavior

After a command or profile is enqueued, there's high write activity and potential lock contention when setting `nano_enrollments.last_seen_time` as hosts check in and receive commands.

This was seen during load testing. It didn't cause major issues, but it's a cause for concern.

🧑‍💻  Steps to reproduce

  1. TODO
  2. TODO

🕯️ More info

We had the same problem for osquery check-ins, and we solved it by batching the updates:

https://github.com/fleetdm/fleet/blob/eb1e54084e586132c72f024c09093f0f96050164/server/service/osquery.go#L89-L98
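
For illustration, here is a minimal sketch of what batching the `last_seen_time` writes could look like. It is not the code behind the link above, and every name in it (`seenBuffer`, `markSeen`, `flush`, the `write` callback) is hypothetical. The idea is to record check-ins in memory and periodically write them in a few multi-row statements instead of one `UPDATE` per check-in. It assumes the stored timestamp can be the flush time rather than the exact check-in time.

```go
package main

import (
	"sort"
	"sync"
)

// seenBuffer collects the IDs of enrollments that checked in since the last
// flush. Hypothetical type, not part of Fleet.
type seenBuffer struct {
	mu  sync.Mutex
	ids map[string]struct{}
}

func newSeenBuffer() *seenBuffer {
	return &seenBuffer{ids: make(map[string]struct{})}
}

// markSeen records a check-in in memory without touching the database.
// Repeated check-ins from the same enrollment collapse into one entry.
func (b *seenBuffer) markSeen(id string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.ids[id] = struct{}{}
}

// drain returns the buffered IDs in sorted order and resets the buffer.
// A stable order means concurrent flushes touch rows in the same sequence.
func (b *seenBuffer) drain() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := make([]string, 0, len(b.ids))
	for id := range b.ids {
		out = append(out, id)
	}
	b.ids = make(map[string]struct{})
	sort.Strings(out)
	return out
}

// flush drains the buffer and passes the IDs to write in chunks of at most
// batchSize. The write callback is assumed to run a single statement such as
//   UPDATE nano_enrollments SET last_seen_time = CURRENT_TIMESTAMP WHERE id IN (...)
// (the key column name is an assumption). IDs not yet written when write
// fails are dropped; a real implementation might re-buffer them.
func (b *seenBuffer) flush(batchSize int, write func(ids []string) error) error {
	ids := b.drain()
	if batchSize <= 0 {
		batchSize = len(ids)
	}
	for start := 0; start < len(ids); start += batchSize {
		end := start + batchSize
		if end > len(ids) {
			end = len(ids)
		}
		if err := write(ids[start:end]); err != nil {
			return err
		}
	}
	return nil
}
```

In this sketch, check-in handlers would call `markSeen`, and a background goroutine driven by a ticker would call `flush` with a callback that runs the batched `UPDATE`. The cost is that `last_seen_time` lags by up to one flush interval.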

gillespi314 commented 2 months ago

Heads up, this will likely be bigger than 2

roperzh commented 2 months ago

@georgekarrv @gillespi314 agreed with Sarah, this is likely an 8, changing it to avoid people picking it with the wrong expectation. Lmk if I should bring it to estimation instead.