osquery / osquery-go

Go bindings for osquery
MIT License
388 stars 79 forks source link

Should we add mutexs to osquery-go? #98

Closed directionless closed 1 year ago

directionless commented 2 years ago

As I understand it, osquery's thrift interface is single threaded. But, this isn't mirrored in this SDK. We leave it to the caller to insure that Query is not called concurrently. If the caller does not respect that single-threaded nature, thrift will emit a slew of socket errors.

I see that the server side already has mutexes, but the client side does not. And while it's easy for the client to wrap the occasional Query in a mutex, it gets thornier with all the entry points. On the client side, we have at least Query, QueryRow, and QueryRows.

I think it would be pretty simple, and correct, to add a mutex to the Client.

I think it would be less simple, but still correct, to add a mutex that could be shared between the server and the client. I'm not sure how that would work with the current API. Best I've got is a package level global with a mutex per path. Or possible finding a way to expose serverClient from the server for client functionality on a shared mutex.

I'm curious what @zwass thinks

If the caller does not mutex, and there are conflicting writes, you'll see errors like i/o timeout or out of order

directionless commented 2 years ago

Thinking more, I think the client and server should not share a mutex. Today we have tables that call Query to generate results. So clearly, those aren't totally exclusive. And thrift is probably full of weird callbacks.

But we should probably add mutexs to the client

zwass commented 2 years ago

Certainly seems we should document the expectations at a minimum, though I'm on board with adding a mutex.

IIRC there's a separate socket for each side of the client/server, so a separate mutex makes sense to me.

directionless commented 2 years ago

You are correct -- Since my initial report, I've noticed that there is a separate server socket. Some initial registration happens on the client one, then it hands off. I think it would be correct to run that initial registration through the client socket, but maybe not an easy patch.

I'll see about getting up a PR with either a mutex or a rate.Limiter. The latter feels like it might have nicer timeout semantics. (Update: Maybe not limiter, though I think it's a cleaner API, I don't love the context requirements on every call)