vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0
17.59k stars 1.55k forks

New `windows_event_log` source #1206

Open ghost opened 4 years ago

ghost commented 4 years ago

It might be useful to support Event Log as a source on Windows platform.

As a first step, we need a specification to better understand requirements.

jamtur01 commented 3 years ago

We should also consider a stand-alone transform for the Windows Event Log format and evtx files.

ethack commented 2 years ago

I'd really like to see Vector as an alternative to Winlogbeat for Windows log shipping.

This library could be useful (not sure about license compatibility): https://github.com/omerbenamram/evtx

ypid-geberit commented 2 years ago

Also refer to the discussion in the duplicate issue #2719. To make it short, I see adding support for the Windows event log as a can of worms. Winlogbeat already does a good job.

ethack commented 2 years ago

@ypid-geberit I read your comment and looked at the beats issue you referenced. I'd like to offer an alternate opinion.

Based on my understanding of the beats issue, those GUID mappings are desirable but not actually necessary in order to have basic Windows event log ingestion. This comment and the following one indicate that simply ingesting Windows events was possible prior to the issue being opened and the motivation was convenience.

The only linked pull request I could find shows what sort of changes would go into the GUID mappings. Note that the code in Winlogbeat is in JavaScript rather than Go. Likewise, I could see something like the mappings being implemented in VRL remaps or a Lua transform. By putting the base XML/EVTX parsing in place in Rust, you'd open up the ability for users to start implementing these types of convenience mappings and potentially contributing them to the community. I agree with you that this is a can of worms, but I also don't see it as a requirement in order for Vector to have something usable.

> GUID lookups and so on that are best done by the agent on the system that emits the logs.

I don't see where it's a requirement to do these on the log emitter. The mappings seem to be hard-coded and not something that requires environment specific queries. I could be wrong and I'm not even sure if that is what you meant by "best". That being said it would actually be pretty cool to be able to do live queries against Active Directory for enrichment in some log sources where a SID is available but not a user/group name. But that's a separate issue.
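To make the "hard-coded mappings" point concrete, here is a minimal sketch in Python of a static lookup applied to an already-parsed event. It is not taken from Winlogbeat or Vector. The table lists a few well-known Windows logon type codes, and the function name `enrich_logon_type` and the output field `LogonTypeName` are hypothetical names chosen for illustration.

```python
# Illustrative only: a static lookup table applied to an already-parsed event.
# The codes below are a subset of the standard Windows logon types.
LOGON_TYPES = {
    "2": "Interactive",
    "3": "Network",
    "4": "Batch",
    "5": "Service",
    "7": "Unlock",
    "10": "RemoteInteractive",
}


def enrich_logon_type(event: dict) -> dict:
    """Return a copy of `event` with a readable logon type name added, if known."""
    enriched = dict(event)
    code = str(event.get("LogonType", "")).strip()
    if code in LOGON_TYPES:
        # "LogonTypeName" is a hypothetical field name, not a Winlogbeat field.
        enriched["LogonTypeName"] = LOGON_TYPES[code]
    return enriched


if __name__ == "__main__":
    print(enrich_logon_type({"EventID": "4624", "LogonType": "3"}))
```

Because the table is static, a lookup like this could run on an aggregator just as well as on the host that emits the logs.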

As for the benefits of re-creating the wheel when Winlogbeat already exists, I don't have much more to add than clay584's comment. Just wanted to link that to the discussion here since the other duplicate issue is closed.

ypid-geberit commented 2 years ago

All right. I see your point and you also got my point. A Windows event log source can be done with Vector. But some form of enrichment will be needed with the event log, at least for Windows internal logs. This is different to other logs so I wanted to point this out beforehand. AD lookup could be interesting.

But due to a change in my work responsibilities (no Microsoft anymore 😉), I guess I will drop out of this discussion.

Rushan4eg commented 2 years ago

How is progress going here?

jszwedko commented 2 years ago

Hi @Rushan4eg ! This is still on our radar, but no progress yet. It is something we are thinking about scheduling for Q3.

umpa385 commented 2 years ago

just trying to get around the issue, would it be possible to pull the xml files for the windows event logs instead, and use that as a work around?

jszwedko commented 2 years ago

> just trying to get around the issue, would it be possible to pull the xml files for the windows event logs instead, and use that as a work around?

Hey @umpa385 . That could work. There is a parse_xml function in VRL: https://vector.dev/docs/reference/vrl/functions/#parse_xml
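For a feel of what XML parsing would have to extract, here is a rough sketch in Python using only the standard library. It is not VRL and not Vector code. The sample event is made up for illustration (only the namespace URI is the real Windows event schema namespace), and `flatten_event` and its output keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Made-up sample event for illustration; real events carry many more fields.
SAMPLE_EVENT = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-Security-Auditing"/>
    <EventID>4624</EventID>
    <TimeCreated SystemTime="2023-01-01T00:00:00.000000Z"/>
    <Channel>Security</Channel>
    <Computer>host.example.com</Computer>
  </System>
  <EventData>
    <Data Name="TargetUserName">alice</Data>
    <Data Name="LogonType">3</Data>
  </EventData>
</Event>"""


def flatten_event(xml_text: str) -> dict:
    """Turn one event XML document into a flat dict (hypothetical key names)."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    record = {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(system.findtext("e:EventID", namespaces=NS)),
        "timestamp": system.find("e:TimeCreated", NS).get("SystemTime"),
        "channel": system.findtext("e:Channel", namespaces=NS),
        "computer": system.findtext("e:Computer", namespaces=NS),
        "data": {},
    }
    # Named <Data> elements become key/value pairs under "data".
    for data in root.findall("e:EventData/e:Data", NS):
        name = data.get("Name")
        if name:
            record["data"][name] = data.text
    return record


if __name__ == "__main__":
    print(flatten_event(SAMPLE_EVENT))
```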

umpa385 commented 2 years ago

will report back on how to get that done as a potential work around even though a native solution would be best.

considering using this https://github.com/omerbenamram/evtx#example-usage-as-library

umpa385 commented 2 years ago

Hello, I wanted to add a current workaround that I tested (it's not a fully supported solution, but it works for my use case).

I'm using https://github.com/omerbenamram/evtx to convert the evtx files into JSON and then having Vector pick those JSON logs up. Please feel free to comment or reach out here if anyone wants more details on this.
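As a rough idea of what post-processing the converted JSON might look like, here is a sketch in Python. It assumes one JSON record per line with a top-level `Event` object containing `System` and `EventData`, and that `EventID` may be either a plain value or an object with a `#text` key. That layout is an assumption about the converter's output, so check it against real data. `summarize_record` and its output keys are hypothetical.

```python
import json


def summarize_record(line: str) -> dict:
    """Pick a few fields out of one JSON record (assumed layout, see lead-in)."""
    event = json.loads(line)["Event"]
    system = event.get("System", {})
    event_id = system.get("EventID")
    # Assumption: EventID is sometimes an object carrying attributes plus "#text".
    if isinstance(event_id, dict):
        event_id = event_id.get("#text")
    return {
        "event_id": event_id,
        "channel": system.get("Channel"),
        "computer": system.get("Computer"),
        "data": event.get("EventData") or {},
    }


if __name__ == "__main__":
    sample = json.dumps({
        "Event": {
            "System": {"EventID": 4624, "Channel": "Security", "Computer": "host"},
            "EventData": {"TargetUserName": "alice"},
        }
    })
    print(summarize_record(sample))
```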

tshepang commented 2 years ago

@umpa385 do you maintain a fork of Vector to do this

umpa385 commented 2 years ago

> @umpa385 do you maintain a fork of Vector to do this

Nope I have a local repo that I'm using to convert the evtx files to json (kind of hacky but thinking about running the .exe I mentioned in the above comment via task scheduler to convert the data to json and then have vector pick that up). That is the way to do it "locally".

What I actually plan to do is see whether I can push the raw evtx files to a Vector aggregator and do the evtx-to-JSON conversion there, so there is less overhead on the endpoint itself.

Let me know if that makes sense or if you have a better way to do this. More than willing to give feedback etc. to help build out something native for windows.

prognant commented 2 years ago

To listen for windows event in realtime we could probably use https://docs.rs/winapi/latest/winapi/um/winevt/fn.EvtSubscribe.html, I'm not exactly sure what we would get from that, but with the evtx lib if needed or just plain xml parsing we should probably be able to get a native source working.

umpa385 commented 2 years ago

so wanted to add more to this. I was able to get this to kind of work in a hacky way, that isn't always 100%.

You use Vector to push the raw evtx file to an aggregator, then run the https://github.com/omerbenamram/evtx#example-usage-as-library binary to convert the evtx files to JSON, and then re-ingest them where you want them to go.

The issue is that using Vector to pull the evtx files from Windows sometimes causes corruption in the evtx files themselves. This causes the Rust conversion binary to not work correctly, leading to blank conversions.

I was able to get this working correctly for a test env, but when trying to go into prod it fell apart, because of the issue mentioned above.
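One cheap guard against the blank conversions described above is to sanity-check a copied file before handing it to the converter. The sketch below, in Python, only verifies the 8-byte `ElfFile\0` signature and the 4096-byte header block at the start of an EVTX file. It will not detect damage deeper in the file, and the function names are hypothetical.

```python
import os

EVTX_MAGIC = b"ElfFile\x00"  # signature at the start of an EVTX file
EVTX_HEADER_SIZE = 4096      # the file header occupies one 4096-byte block


def has_evtx_header(data: bytes) -> bool:
    """True if `data` is at least one header block long and starts with the signature."""
    return len(data) >= EVTX_HEADER_SIZE and data.startswith(EVTX_MAGIC)


def looks_like_evtx(path: str) -> bool:
    """Check a file on disk; unreadable or too-short files count as not valid."""
    try:
        if os.path.getsize(path) < EVTX_HEADER_SIZE:
            return False
        with open(path, "rb") as handle:
            return has_evtx_header(handle.read(EVTX_HEADER_SIZE))
    except OSError:
        return False


if __name__ == "__main__":
    good = EVTX_MAGIC + b"\x00" * (EVTX_HEADER_SIZE - len(EVTX_MAGIC))
    print(has_evtx_header(good), has_evtx_header(b"truncated"))
```

Files that fail the check could be skipped and retried on the next run instead of producing an empty conversion.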

backeby commented 1 year ago

This would be an amazing feature to have

wikro commented 1 year ago

Yes, this is something we would want too!

ebusto commented 1 year ago

This is relevant to my interests and I wish to subscribe to your newsletter.

otisg commented 11 months ago

@jszwedko is this still on the roadmap? If so, what sort of timeline are you thinking? Thanks. @umpa385 thank you for sharing your approach and diligently following up with your updates. 👍

jszwedko commented 11 months ago

> @jszwedko is this still on the roadmap? If so, what sort of timeline are you thinking? Thanks. @umpa385 thank you for sharing your approach and diligently following up with your updates. 👍

Unfortunately no updates at this point. We'd be happy to see a PR in this area if there is a motivated contributor. We are partially hampered by the lack of Windows expertise on the team.

pabloem commented 8 months ago

Hello team. I'm interested in picking up this item. I'll try and do the relevant research and propose a minimum design in the next few days (realistically weeks:))

umpa385 commented 6 months ago

I took another stab at this. I have a semi-working version, if you don't care about PowerShell logs. The idea is to use https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/wevtutil to export the logs and then use the file source to pick them up via Vector and send them where you want them.

It's not the best process: create a PowerShell script that writes the files to a directory, have Vector pick up those files, and have PowerShell overwrite them every time (I was going to use a scheduled task to write out the logs every 3 mins).
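The export step could look roughly like the following, sketched in Python rather than PowerShell. It is not the script used above. It builds a `wevtutil qe` call with an XPath `timediff` filter for the last 3 minutes (180000 ms) and XML output wrapped in a root element. The function names, the `Events` root element name, and the output path are hypothetical, and the output encoding is assumed to match the system locale.

```python
import subprocess
from typing import List


def build_wevtutil_query(channel: str, window_ms: int = 180_000) -> List[str]:
    """Build a wevtutil command that queries events newer than `window_ms` milliseconds."""
    xpath = f"*[System[TimeCreated[timediff(@SystemTime) <= {window_ms}]]]"
    # /f:xml selects XML output; /e:Events wraps the results in a single root element.
    return ["wevtutil", "qe", channel, f"/q:{xpath}", "/f:xml", "/e:Events"]


def export_recent_events(channel: str, out_path: str, window_ms: int = 180_000) -> None:
    """Run the query (Windows only) and overwrite `out_path` with the result."""
    result = subprocess.run(
        build_wevtutil_query(channel, window_ms),
        capture_output=True,
        text=True,
        check=True,
    )
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)


if __name__ == "__main__":
    # Printing the command is safe on any platform; running it needs Windows.
    print(build_wevtutil_query("System"))
```

Overwriting one file per channel on a fixed schedule can miss or duplicate events around the window boundary, so the window and the schedule interval need to be chosen together.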

I'm going to try this again with winlogbeat to file instead so I can also get the powershell logs.

otisg commented 6 months ago

@umpa385 is wevtutil available in "all" versions of Windows?

umpa385 commented 4 months ago

Okay so sadly takes 2 agents, but it works, use winlogbeat to send logs over to vector. Trying to do something native is much harder right now, but this two agent system works.

umpa385 commented 4 months ago

Here is the config I have in vector. Basically setup winlogbeat as a logstash output and then vector ingests that.

`sinks: win_logs: type: http inputs:

Klar commented 4 months ago

> sinks:
>   win_logs:
>     type: http
>     inputs:
>       - windows_logs
>     encoding:
>       codec: "json"
>     uri: >-
>       https:
>     auth:
>       strategy: bearer
>       token:
>     compression: gzip
>     batch:
>       max_bytes: 1000000
>   sink_id:
>     type: "aws_s3"
>     inputs:
>       - windows_logs
>     bucket:
>     key_prefix:
>     auth:
>       region:
>       access_key_id:
>       secret_access_key:
>       assume_role:
>     encoding:
>       codec: "json"

can you clarify how you do it? winlogbeat (logstash output) --> vector input --> vector sink?

Which Vector source type do you use, and on which address? The default winlogbeat port 5045?

The data I receive in vector in .message field is gibberish.

mkfrey commented 3 months ago

> Okay so sadly takes 2 agents, but it works, use winlogbeat to send logs over to vector. Trying to do something native is much harder right now, but this two agent system works.

Nice idea. I've tried it and it works for basic transfer of windows logs. However, I've found that with the default mem queue, messages are delayed for up to 40 seconds. When using a disk queue for robustness, this can even go up to 10 minutes.

So having a log source directly integrated would be a huge improvement.

umpa385 commented 3 months ago

> > Okay so sadly takes 2 agents, but it works, use winlogbeat to send logs over to vector. Trying to do something native is much harder right now, but this two agent system works.
>
> Nice idea. I've tried it and it works for basic transfer of windows logs. However, I've found that with the default mem queue, messages are delayed for up to 40 seconds. When using a disk queue for robustness, this can even go up to 10 minutes.
>
> So having a log source directly integrated would be a huge improvement.

Have you tried to use the logstash source? We are not seeing that level of delay, we are not writing it down to a file.

mkfrey commented 3 months ago

Yes, I used the logstash output of winlogbeat and the logstash source with Vector. Winlogbeat without disk introduces a delay ranging from 0 to 30 seconds.

sandervandegeijn commented 2 months ago

Since Elastic is creating somewhat of a walled garden since the licensing change I'd rather stay away from winlogbeat and use Vector natively. Alternative could be to use fluent-bit, but I've had issues with it in the past, so I'm hesitant. A native Vector solution would be nice.