Open albertzaharovits opened 4 years ago
Pinging @elastic/es-security (:Security/Audit)
This would definitely help customers tune their audit trails and avoid logging too much data.
I think the challenging part is defining the "types" we should use to classify our events. Most of the time, customers have to turn the general setting on because they are looking for detailed information that they cannot get without the full request, for example the parameters of a search, or the new values when settings are changed.
If we were able to provide a better experience for each individual action, they probably wouldn't need the entire body anymore, except in the very few cases where their compliance policy requires it. That would also be preferable because the body is currently exposed as a JSON-encoded string, which is not easy to manipulate or consume.
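To illustrate the point about the JSON-encoded string: a consumer currently has to decode each audit entry twice, once for the log line and once for the body. Below is a minimal sketch of that, assuming the field names of the JSON audit log (request.body, url.path, user.name); the sample entry and the decode_body helper are made up for illustration and are not part of Elasticsearch.

```python
import json

# A made-up audit entry. Field names follow the JSON audit log layout,
# but the values are invented for illustration only.
sample_line = json.dumps({
    "event.action": "authentication_success",
    "user.name": "jane",
    "url.path": "/index1/_search",
    "request.method": "POST",
    # The body arrives as a JSON-encoded string, not as a nested object.
    "request.body": json.dumps(
        {"query": {"match": {"name": "Guybrush Threepwood"}}}
    ),
})


def decode_body(line):
    """Parse one audit log line and, where possible, its request.body.

    Hypothetical helper: bodies that are not a single JSON document
    (for example newline-delimited bulk bodies) are left as raw strings.
    """
    entry = json.loads(line)
    body = entry.get("request.body")
    if isinstance(body, str):
        try:
            entry["request.body"] = json.loads(body)
        except json.JSONDecodeError:
            pass  # keep the raw string
    return entry


entry = decode_body(sample_line)
print(entry["request.body"]["query"]["match"]["name"])
```

Only after the second decode can the search terms be queried as structured data, which is the extra step a richer per-action event could avoid.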
Do we already know how much effort it would take to introduce a filter on which events should emit the body? Would it make sense to include more details in the "default" information for those events instead?
As a user of this feature, what we are after is understanding who has searched what indexes and what they were searching for. So for example we can trace that Jane User successfully searched for Guybrush Threepwood across index1, index2 and index3. We would also like to see that Jane User tried to search for poor Guybrush across index4 and index5, but was denied access. At the moment we have configured emit body to capture what Jane User was searching for (only really works on one cluster), but as a side effect we also get the results of Jane's search which can span multiple megabytes in a single document, causing Kibana to error and not show anything with the default settings. This isn't desirable behaviour for 3 reasons:
Hope this helps a little.
but as a side effect we also get the results of Jane's search which can span multiple megabytes in a single document
capturing the search results in another cluster means we now have an unnecessary secondary storage ...
Could you please clarify what you mean by "results" and "search results"? The xpack.security.audit.logfile.events.emit_request_body setting enables only the request body, not the response. How did the results get captured? Thanks!
sample_redacted.txt Sorry for the slow reply. I've attached a heavily redacted version of one of the monster entries from the monitoring cluster. This is capturing a copy of the data from the main cluster as part of the entry. Twice. I've obviously pruned it right back as the thought of going through a 2.5MB document to check I have redacted all the returned data was more than I could take.
Actually looking over it again, it's at least partly tracking what the data_writer is writing to the indexes. Hopefully not everything, but I haven't the will to plough through all the data being written to the primary cluster.
Thanks @jmac-met
The large chunks in the log file are request bodies, specifically bulk indexing request bodies. That is why they are so big. They are not search results, so no concern there. But I see your point, and we are aware that these large bulk indexing request bodies can become unwieldy, which is what this issue is about: emitting the request body "only for searches but not for indexing".
xpack.security.audit.logfile.events.emit_request_body is used to toggle REST request body auditing. This is a coarse control. I think it makes sense to be able to specify auditing the body for certain requests but not others, e.g. only for searches but not for indexing.
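As a rough sketch of what finer-grained control could look like, the snippet below decides per request whether the body goes into the audit entry, based on the REST path. Everything here is an assumption for discussion: the pattern list, should_emit_body and audit_entry are hypothetical names, not existing settings or Elasticsearch code.

```python
from fnmatch import fnmatchcase

# Hypothetical allow-list: emit the body only for REST paths that match
# one of these glob patterns. Not an existing Elasticsearch setting.
BODY_PATH_PATTERNS = ["*/_search", "*/_msearch", "/_cluster/settings"]


def should_emit_body(url_path, patterns=BODY_PATH_PATTERNS):
    """Return True if the request body should be audited for this path."""
    return any(fnmatchcase(url_path, pattern) for pattern in patterns)


def audit_entry(user, method, url_path, body):
    """Build a simplified audit entry, attaching the body only when allowed."""
    entry = {
        "user.name": user,
        "request.method": method,
        "url.path": url_path,
    }
    if should_emit_body(url_path):
        entry["request.body"] = body
    return entry


# A search keeps its body; a bulk indexing request does not.
print(audit_entry("jane", "POST", "/index1/_search", '{"query":{"match_all":{}}}'))
print(audit_entry("data_writer", "POST", "/index1/_bulk", "...large payload..."))
```

Matching on the path is only one option. The same decision could instead key off the event type or the action name, which is the open question about how to classify events.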