fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Distributed query errors are not stored in errorstore #5862

Open roperzh opened 2 years ago

roperzh commented 2 years ago

Fleet version: (head to the "My account" page in the Fleet UI or run fleetctl --version)

main @ 70d0a54

Operating system: (e.g. macOS 11.2.3)

any

Web browser: (e.g. Chrome 88.0.4324)

any


🧑‍💻  Expected behavior

When a distributed query fails on a host, the error is stored to aid debugging.

💥  Actual behavior

No errors are stored. This happens because we only store errors when the request itself fails, but for this particular endpoint we only log the errors and the request still succeeds.

https://github.com/fleetdm/fleet/blob/70d0a546eccd341a153ae05a8b62112a529def2b/server/service/osquery.go#L824-L827
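To make the behavior described above concrete, here is a minimal sketch of the pattern, not Fleet's actual code. It assumes errors are recorded by a wrapper that only sees errors returned from the handler; `errorStore`, `withErrorStore`, `submitResultsLogOnly`, `submitResultsLogAndStore`, and `errQueryFailed` are all hypothetical names made up for illustration.

```go
package main

import (
	"errors"
	"log"
)

// errQueryFailed is a hypothetical per-host query error used for illustration.
var errQueryFailed = errors.New("no such table: example")

// errorStore is a hypothetical stand-in for an error store; it is NOT Fleet's errorstore API.
type errorStore struct{ stored []error }

// Store records one error.
func (s *errorStore) Store(err error) { s.stored = append(s.stored, err) }

// withErrorStore mimics a wrapper that records an error only when the
// wrapped handler returns one, i.e. only when the request itself fails.
func withErrorStore(s *errorStore, handler func() error) error {
	err := handler()
	if err != nil {
		s.Store(err)
	}
	return err
}

// submitResultsLogOnly logs the per-host query error but returns nil,
// so the request succeeds and the wrapper never sees the error.
func submitResultsLogOnly(queryErr error) func() error {
	return func() error {
		if queryErr != nil {
			log.Printf("distributed query failed on host: %v", queryErr)
		}
		return nil
	}
}

// submitResultsLogAndStore also hands the error to the store explicitly,
// while still keeping the request successful.
func submitResultsLogAndStore(s *errorStore, queryErr error) func() error {
	return func() error {
		if queryErr != nil {
			log.Printf("distributed query failed on host: %v", queryErr)
			s.Store(queryErr)
		}
		return nil
	}
}
```

With the log-only handler the store stays empty even though the query failed; with the second variant the error is stored and the request still returns no error.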

@chiiph @mna should we store these errors? I understand the reasoning behind not failing the request if an error occurred in a host but seems like we should still store the error. Any thoughts?

xpkoala commented 2 years ago

@chiiph Poking to get some traction on this. Thank you!

chiiph commented 2 years ago

> @chiiph @mna should we store these errors? I understand the reasoning behind not failing the request if an error occurred in a host but seems like we should still store the error. Any thoughts?

Hm, I suppose we can. The worry is that if things are failing here and we store it, we might wreck Redis. Maybe we could buffer them? Or I guess we can always try the simple approach and load test it.

mna commented 2 years ago

> the worry is that if things are failing here and we store it, we might wreck Redis.

You mean because of the number of requests, right? Because storage-wise, the deduplication logic should store only one instance of repeating errors. I think with a big number of hosts this could indeed be a concern.

chiiph commented 2 years ago

> You mean because of the number of requests, right?

Yeah, just hammering Redis, basically. We could buffer them, but then it's not as simple as "let's just add these".
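For reference, the buffering idea could look roughly like the sketch below. It is only an illustration under stated assumptions, not an existing Fleet component: errors are deduplicated in memory by a hash of their message and handed to a sink in batches, so many failing hosts produce one write per flush instead of one per request. `bufferedErrors`, `flushFunc`, `errorKey`, and the two sample errors are hypothetical names, and the sink is left abstract rather than tied to any Redis client.

```go
package main

import (
	"context"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"sync"
	"time"
)

// Hypothetical sample errors, used only for illustration.
var (
	errNoSuchTable = errors.New("no such table: example")
	errTimeout     = errors.New("query timed out")
)

// flushFunc is a hypothetical sink, e.g. a single batched write to the store.
// It receives a map of error key to number of occurrences.
type flushFunc func(batch map[string]int)

// bufferedErrors accumulates deduplicated error counts in memory.
type bufferedErrors struct {
	mu      sync.Mutex
	counts  map[string]int
	maxKeys int // flush early once this many distinct errors are buffered
	flush   flushFunc
}

func newBufferedErrors(maxKeys int, flush flushFunc) *bufferedErrors {
	return &bufferedErrors{counts: make(map[string]int), maxKeys: maxKeys, flush: flush}
}

// errorKey derives a short, stable key from the error message.
func errorKey(err error) string {
	sum := sha256.Sum256([]byte(err.Error()))
	return hex.EncodeToString(sum[:8])
}

// Add buffers one error occurrence; it only touches the sink when the
// buffer holds maxKeys distinct errors.
func (b *bufferedErrors) Add(err error) {
	if err == nil {
		return
	}
	key := errorKey(err)
	b.mu.Lock()
	b.counts[key]++
	full := len(b.counts) >= b.maxKeys
	b.mu.Unlock()
	if full {
		b.Flush()
	}
}

// Flush swaps out the buffer and sends it to the sink if it is non-empty.
func (b *bufferedErrors) Flush() {
	b.mu.Lock()
	batch := b.counts
	b.counts = make(map[string]int)
	b.mu.Unlock()
	if len(batch) > 0 {
		b.flush(batch)
	}
}

// Run flushes on a fixed interval until ctx is cancelled, then flushes once more.
func (b *bufferedErrors) Run(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			b.Flush()
			return
		case <-ticker.C:
			b.Flush()
		}
	}
}
```

The trade-off matches the concern in the thread: this bounds the number of writes, but it adds a background goroutine, a flush interval to tune, and the possibility of losing buffered errors if the process dies before a flush.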

roperzh commented 2 years ago

gotcha, in that case it might not be worth the effort