philhagen / sof-elk

Configuration files for the SOF-ELK VM
GNU General Public License v3.0

Best practice for local Evtx ingestion #332

Open aarislarsen opened 1 month ago

aarislarsen commented 1 month ago

What is the optimal way to ingest offline copies of extracted Windows Event Logs (evtx files) into SOF-ELK?

I love working in SOF-ELK, but I find myself in the same situation over and over again: I'm handed exported event logs from a high enough number of hosts that manual analysis becomes a pain. Whenever this happens, I reach for EvtxECmd and convert them to JSON with evtxecmd.exe -d d:\logs --json -D d:\json, of course having run evtxecmd.exe --sync first. As long as the naming convention is maintained for the output files, they get picked up easily enough by SOF-ELK when placed in the /logstash/kape/ folder. However, I find over and over again that events are missing when searching through the evtxlogs-* index.
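(For readers following the same workflow, a minimal sketch of the convert-and-copy steps is below. The drive letters, the <SOF-ELK-IP> placeholder, and the example output filename are illustrative assumptions rather than details from this report, and the exact EvtxECmd switches should be confirmed against EvtxECmd.exe --help for your version.)

REM Sketch only: refresh EvtxECmd maps, then convert a folder of exported EVTX files to JSON in one pass.
EvtxECmd.exe --sync
EvtxECmd.exe -d D:\logs --json D:\json
REM Copy the resulting JSON (filename below is a placeholder) into SOF-ELK's KAPE ingest folder.
scp D:\json\20240805_EvtxECmd_Output.json elk_user@<SOF-ELK-IP>:/logstash/kape/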

In spite of SOF-ELK having been around for so long, I find there is very little on this topic anywhere, so I really wanted to ask what the most effective and reliable way of getting Windows Event Logs into SOF-ELK is, or whether SOF-ELK simply isn't the right tool for doing this type of work at scale.

In my current use case I have event logs collected from a Windows Server 2016 host via wevtutil epl, in an environment where I control all the audit settings. As such, I've confirmed that the expected logging is enabled, and I can see events like 4624 and 4625 on the respective hosts. Running those logs through EvtxECmd and looking at the resulting JSON, I can also see those events there, but once ingested into SOF-ELK, none of them seem to be present. Plenty of other events are there, just not the ones I'm really looking for.
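(For context, an export of the kind described above boils down to a one-liner like the following; the channel name and output path are illustrative assumptions.)

wevtutil epl Security D:\export\Security.evtx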

I've tried ingesting the raw Evtx files using https://github.com/blardy/evtx2elk and https://github.com/dgunter/evtxtoelk, resulting in the same issue, so I'm really at a loss as to what is going wrong here.

So how do YOU ingest Evtx when testing the SOF-ELK build?

philhagen commented 1 month ago

Hello! The workflow you describe with EvtxECmd is the approach I generally take. You're correct that not all events have parsers, and this is an area of ongoing development effort. If you can provide individual JSON lines as examples of what is not being handled, I can troubleshoot and see about writing a parser. (Sending them by email is fine too, and obviously I understand if some need to be sanitized.)

One important note: just a few days ago I finished a significant refactoring of the KAPE parsers, including the use of ECS-compatible field naming wherever possible. That's still in testing, and I hope to have it released within a month or so as part of the massive ECS renaming project that's been underway for what seems like forever. (Ugh.) Anyway, depending on the amount of work required to create or fix the parser, I may want to include it in the future release instead of dual-tracking it as a current + future feature.

aarislarsen commented 1 month ago

sample logs.zip 20240805055604_EvtxECmd_Output (2).zip

Attached here are the raw logs as well as the converted ones. The ones that aren't being parsed are 4624 and 4625, so I don't think the issue is the need for a new parser, but rather something being broken or me doing something wrong (not an unlikely scenario).

Here's one of the JSON lines that simply aren't making it into SOF-ELK:

{"PayloadData1":"Target: NT AUTHORITY\\SYSTEM","PayloadData2":"LogonType 5","PayloadData3":"LogonId: 0x3E7","PayloadData4":"AuthenticationPackageName: Negotiate","PayloadData5":"LogonProcessName: Advapi ","UserName":"WORKGROUP\\WIN-4RLJP0JKC38$","RemoteHost":"- (-)","ExecutableInfo":"C:\\Windows\\System32\\services.exe","MapDescription":"Successful logon","ChunkNumber":0,"Computer":"WIN-4RLJP0JKC38","Payload":"{\"EventData\":{\"Data\":[{\"@Name\":\"SubjectUserSid\",\"#text\":\"S-1-5-18\"},{\"@Name\":\"SubjectUserName\",\"#text\":\"WIN-4RLJP0JKC38$\"},{\"@Name\":\"SubjectDomainName\",\"#text\":\"WORKGROUP\"},{\"@Name\":\"SubjectLogonId\",\"#text\":\"0x3E7\"},{\"@Name\":\"TargetUserSid\",\"#text\":\"S-1-5-18\"},{\"@Name\":\"TargetUserName\",\"#text\":\"SYSTEM\"},{\"@Name\":\"TargetDomainName\",\"#text\":\"NT AUTHORITY\"},{\"@Name\":\"TargetLogonId\",\"#text\":\"0x3E7\"},{\"@Name\":\"LogonType\",\"#text\":\"5\"},{\"@Name\":\"LogonProcessName\",\"#text\":\"Advapi \"},{\"@Name\":\"AuthenticationPackageName\",\"#text\":\"Negotiate\"},{\"@Name\":\"WorkstationName\",\"#text\":\"-\"},{\"@Name\":\"LogonGuid\",\"#text\":\"00000000-0000-0000-0000-000000000000\"},{\"@Name\":\"TransmittedServices\",\"#text\":\"-\"},{\"@Name\":\"LmPackageName\",\"#text\":\"-\"},{\"@Name\":\"KeyLength\",\"#text\":\"0\"},{\"@Name\":\"ProcessId\",\"#text\":\"0x260\"},{\"@Name\":\"ProcessName\",\"#text\":\"C:\\\\Windows\\\\System32\\\\services.exe\"},{\"@Name\":\"IpAddress\",\"#text\":\"-\"},{\"@Name\":\"IpPort\",\"#text\":\"-\"},{\"@Name\":\"ImpersonationLevel\",\"#text\":\"%%1833\"},{\"@Name\":\"RestrictedAdminMode\",\"#text\":\"-\"},{\"@Name\":\"TargetOutboundUserName\",\"#text\":\"-\"},{\"@Name\":\"TargetOutboundDomainName\",\"#text\":\"-\"},{\"@Name\":\"VirtualAccount\",\"#text\":\"%%1843\"},{\"@Name\":\"TargetLinkedLogonId\",\"#text\":\"0x0\"},{\"@Name\":\"ElevatedToken\",\"#text\":\"%%1842\"}]}}","Channel":"Security","Provider":"Microsoft-Windows-Security-Auditing","EventId":4624,"EventRecordId":"844","ProcessId":616,"ThreadId":3984,"Level":"LogAlways","Keywords":"Audit success","SourceFile":".\\Logs\\Security.evtx","ExtraDataOffset":0,"HiddenRecord":false,"TimeCreated":"2024-07-03T10:45:36.7957087+00:00","RecordNumber":21}

mpilking commented 1 month ago

Hi All,

I happened to see this post and thought I'd chime in on an option to send the event logs directly via Elastic's Winlogbeat agent.

I'm a big fan of EvtxECmd and the work it does via maps to normalize the data, so ingesting in that format would be a great option to have. But another option is simply using Elastic's Winlogbeat agent to forward the events into Elasticsearch. This is what we do in our SANS FOR608 class. Here's the config file we use (named for our use case: winlogbeat-security-archive-evtx.yml):

# This is a basic configuration file to forward Windows security EVTX files
# directly to Elasticsearch. It uses Winlogbeat's included parsing processor
# script to extract out additional useful fields from the security event logs. 

# This config is based on documentation from Elastic for reading in EVTX files:
# https://www.elastic.co/guide/en/beats/winlogbeat/master/reading-from-evtx.html

# Review Winlogbeat's included "winlogbeat-reference.yml" for very detailed
# description of these configuration options and MANY more.

# To run this against multiple EVTX files, use a FOR loop such as the following.
# This will go recursively through G:\Elastic\Winlogbeat\logs\ looking for any 
# .evtx files and feed them into the command starting with ".\winlogbeat.exe..."

# for /r "G:\Elastic\winlogbeat\logs\sample logs\Logs" %f in (*.evtx) do .\winlogbeat.exe -e -c .\winlogbeat-security-archive-evtx.yml -E EVTX_FILE="%f"

winlogbeat.event_logs:
  - name: ${EVTX_FILE}
    no_more_events: stop 
    processors:
      - script:
          lang: javascript
          id: security
          file: ${path.home}/module/security/config/winlogbeat-security.js

winlogbeat.shutdown_timeout: 30s 

winlogbeat.registry_file: archive-security-evtx-registry.yml

output.elasticsearch.hosts: ['127.0.0.1:9200']

As described in the config comments, use a command such as the following in a CMD window to ingest multiple archived .evtx log files from a directory:

for /r "G:\Elastic\winlogbeat\logs\sample logs\Logs" %f in (*.evtx) do .\winlogbeat.exe -e -c .\winlogbeat-security-archive-evtx.yml -E EVTX_FILE="%f"

As a quick test, I took your sample logs and ingested them into the latest public release of SOF-ELK. This was done using Winlogbeat 7.10.2 from a Windows 10 VM (an old Winlogbeat version, but still the preferred one for OpenSearch compatibility). The initial results look good: it ingested a total of 600,106 events across all the EVTX files.

[screenshot: total event count after ingestion]

Here's a screenshot showing the breakdown of event providers based on the different .evtx files it ingested:

[screenshot: breakdown of event providers across the ingested .evtx files]

And specific to the logon events you mentioned, here's a screenshot showing a query of some 4624 logons in Kibana (column names cleaned up):

[screenshot: Kibana query showing 4624 logon events]

Let me know if you have any questions. Ingesting recovered EVTX files is exactly the kind of scenario we cover in our FOR608 class, so I'm happy to share what we've found works, and to hear other options too!

-Mike

mpilking commented 1 month ago

I just thought of one more point to add. To forward the events into Elasticsearch on SOF-ELK from a Windows VM while using the config as-is (pointing to the localhost IP 127.0.0.1), you can set up an SSH session from the Windows VM to the SOF-ELK Linux VM like this, assuming WSL is installed:

ssh elk_user@<ELK-LINUX-VM-IP> -L 9200:localhost:9200 -L 5601:localhost:5601 -L 5044:localhost:5044

This also allows you to send to Logstash (5044) and to connect to the Kibana UI (5601) using the localhost IP.
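(As an optional sanity check that is not part of Mike's original note: with the tunnel up, Elasticsearch should answer on the forwarded port from the Windows side, and Windows 10 and later ship a curl client that can confirm this before starting the ingest.)

curl http://127.0.0.1:9200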

-Mike

aarislarsen commented 1 month ago

@mpilking thank you for this, it works like a charm. It's slower by a factor of three, but if it's parsing more event types then that makes sense, and it's still very much within a reasonable time.

philhagen commented 1 month ago

Thanks, Mike! And thanks for the samples - I've got some time set aside later today or tomorrow for this and will expand the coverage. It'll be on the future ECS branch, though - so stay tuned for when it goes live. (There was a TON of refactoring in the parsers that went into that branch/functionality, and backporting to the current version would be a massive lift.)

I've been using the Winlogbeat format as the model for any data sources that use EVTX files as an original source. Unfortunately, since Winlogbeat doesn't appear to support non-Windows hosts, I haven't explored the native EVTX load pipeline that Mike describes. That said, the SSH pipeline (or just opening TCP/5044 on the SOF-ELK host and shipping natively) may be a way to explore that path.
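(For anyone who wants to explore that second path, a minimal sketch of the change to Mike's config is below. It is untested here; a Beats instance supports only one enabled output at a time, so the output.elasticsearch line would be replaced, and the 127.0.0.1 host is an assumption based on the SSH tunnel from the earlier comment rather than anything confirmed in this thread.)

# Ship the rendered events to SOF-ELK's Logstash listener instead of Elasticsearch.
# 127.0.0.1 assumes the -L 5044:localhost:5044 SSH tunnel shown earlier; otherwise
# use the SOF-ELK VM's address with TCP/5044 opened on that host.
output.logstash:
  hosts: ['127.0.0.1:5044']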

mpilking commented 1 month ago

EvtxECmd does an admirable job parsing raw event logs and normalizing them in a way we can't do with native Windows events. Normalization is a problem because different developers are responsible for creating different events, and unfortunately they often use different field names to mean effectively the same thing.

That said, the benefit of running a log shipper directly on the Windows OS is that it can use the Windows Event Log API, which reads and fully renders all the events properly. The API allows the shipper to retrieve message information that is not stored in the raw EVTX log files. This is likely why Winlogbeat won't run on other operating systems: it can't render the events as effectively on non-Windows hosts.

mpilking commented 1 month ago

FYI, here's a good article about this issue from Mike Cohen on the Velociraptor site: https://docs.velociraptor.app/blog/2019/2019-11-12_windows-event-logs-d8d8e615c9ca/