hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Audit Logs #3962

Closed jaininshah9 closed 4 years ago

jaininshah9 commented 6 years ago

Currently, we have a requirement for auditing all the events executed by Nomad clients: all allocations, de-allocations, and any other commands (including alloc ID and job name). I tried to look at the code to see if there is an audit logging option in Nomad, but it seems that there isn't. Is this part of your roadmap?

schmichael commented 6 years ago

Audit logging is on the roadmap for the Enterprise version of Nomad. We don't have an exact timeline at this point.

jaininshah9 commented 6 years ago

Okay. Would you accept it in OSS if I develop it?

diptanu commented 6 years ago

From my perspective, audit logging is about more than security; it's about operators knowing exactly what the client has been doing when something needs to be debugged. Following the client logs to create a timeline after an outage, for example, is nearly impossible on a very busy node. Without a feature like this, it's hard to have enough evidence when a user complains that something didn't work the way it should have at a given point in time. In my opinion, people come for the shiny features but continue to use a tool for its reliability, so hopefully we will make it easier to make solid deployments of Nomad.

I think if this becomes an Enterprise feature, it should at least be behind an interface where the OSS version has a default implementation. The Enterprise version could make it easy for folks to query the audit log, do cluster-wide aggregation/correlation, etc.

SoMuchToGrok commented 6 years ago

Agreed @diptanu

Whenever we have outages with either a particular job or an entire node, it's almost impossible to get any data from the logs (no matter the log level). Definitely a pain point for us right now. At times it makes Nomad feel like a magic box that we don't have much visibility into. Enterprise or not, this feature is much needed.

shantanugadgil commented 6 years ago

In addition to being an operational pain, there is also a need to document the RCA. Such a feature would indeed be useful.

codyja commented 6 years ago

I would love to see this as well. We have a need to audit what has happened, especially for tracking down issues, who's done deployments, node drains, etc.

shikloshi commented 5 years ago

+1

camerondavison commented 5 years ago

Has anyone set up something like https://github.com/seatgeek/nomad-firehose until this issue can be worked on? I was surprised that this does not already exist. I was trying to figure out why one of my jobs got deleted, and to my surprise there are no logs anywhere to say who deleted it.

schmichael commented 4 years ago

Closing this, as audit logging was released as an Enterprise feature in 0.11.0.

Watch #2126 for nomad-firehose-esque behavior. We wanted to cover audit/compliance and programmatic event consumers independently, as each has its own use cases.

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.