Open roperzh opened 2 years ago
@chiiph Poking to get some traction on this. Thank you!
@chiiph @mna should we store these errors? I understand the reasoning behind not failing the request if an error occurred in a host but seems like we should still store the error. Any thoughts?
Hm, I suppose we can, the worry is that if things are failing here and we store it, we might wreck Redis. Maybe if we buffer them? Or I guess we can always try the simple approach and load test it.
> the worry is that if things are failing here and we store it, we might wreck Redis.
You mean because of the number of requests, right? Because storage-wise, the deduplication logic should store only one instance of repeating errors. I think with a big number of hosts this could indeed be a concern.
> You mean because of the number of requests, right?

Yeah, just hammering Redis, basically. We could buffer them, though, but then it's not as simple as "let's just add these".
gotcha, in that case it might not be worth the effort
Fleet version (head to the "My account" page in the Fleet UI or run fleetctl --version): main @ 70d0a54
Operating system (e.g. macOS 11.2.3): any
Web browser (e.g. Chrome 88.0.4324): any
🧑💻 Expected behavior
When a distributed query fails on a host, an error is stored to aid debugging.
💥 Actual behavior
No errors are stored. This happens because we only store errors when the request itself fails, but for this particular endpoint per-host errors are only logged and the request still succeeds:
https://github.com/fleetdm/fleet/blob/70d0a546eccd341a153ae05a8b62112a529def2b/server/service/osquery.go#L824-L827