tsale / EDR-Telemetry

This project aims to compare and evaluate the telemetry of various EDR products.

Linux Telemetry Section #21

Open craighrowland opened 1 year ago

craighrowland commented 1 year ago

It would be good to break out Windows vs. Linux telemetry for EDR, as the two platforms have very different coverage needs. Linux coverage can cover process attacks like Windows. However, it also has a lot of non-process-based data that needs good telemetry to detect attacks.

I'd propose as a starting point these high level-categories for telemetry type data:

Processes (process activity, creation times, owners, binary data, network activity, etc.)
Files (general coverage for file attributes, creation times, owners, hashes, entropy, etc.)
Directories (general directory coverage for attributes like files above, etc.)
Logs (syslog, utmp, btmp, wtmp, lastlog, log data, etc.)
Users (accounts, passwords, SSH keys, login activity, etc.)
Kernel (kernel modules, status, etc.)
Systemd (services, lingering processes, general systemd units)
Scheduled Tasks (cron/at/systemd running, owners, etc.)

exeronn commented 1 year ago

I thought I'd try & get this started by mapping out SysmonForLinux & seeing how it fits in with a hybrid of the current mapping for Windows & the suggestions from @craighrowland.

There was more initial overlap than I thought there would be, if we abstract things like "Services" to include systemd or service in Linux & similar for scheduled tasks. I'm very much taking the Windows one as the lead & I'm thinking items like file attributes, creation times, etc. might be Yes/Partial/No requirements rather than fields.

We also need to think about whether we want to include some more specific but common data feeds such as AppArmor & SELinux. I briefly looked at the evented tables in osquery to get an idea for other data sets.

LinuxEDR-v0.csv

I added the evidence for SysmonForLinux to https://github.com/exeronn/Linux-Detection/tree/main/Sysmon/EventTypes - so we can fill it out in the pull request once we've got a way of doing it.

For reference the partials are:

Process Access: It only looks to include ptrace events
File Read: This may be better as a no, currently it's only raw read access that shows up in this
Tampering: You can see config changes

tsale commented 1 month ago

Hi everyone! I'm creating the configuration for the Linux category. Here’s the list of events I plan to include in the table.

ProcessCreate
FileCreated
FileModified
UserLogon
UserLogoff
LogonFailed
ScriptContent
NetworkConnect
NetworkListen
NetworkRawSocket
URL
ScheduledTask
ProcessTerminate
UserAccountCreated
UserAccountModified
UserAccountDeleted
DriverLoad
ImageLoad
RawAccessRead
ProcessAccess
FileCreate
DnsQuery
FileDelete
ProcessTampering
ServiceCreation
ServiceModification
ServiceDeletion
AgentStart
AgentStop

I want to ensure we get everything right from the beginning, before we bring on multiple vendors and have to analyze new events. Please let me know if you agree or if you have any other suggestions that might be a good fit.

mthcht commented 1 month ago

I think it would be valuable to know which EDR can provide telemetry for eBPF events or syscall activity.

SecurityAura commented 1 month ago

Suggestion: NetworkListen.

When a process is listening on a network port.

I'm guessing here that NetworkConnect would cover both outbound (host to remote) and inbound (remote to host). If not, the "inbound" could be something akin to NetworkAccept (MDE-like terminology here).

tsale commented 1 month ago

@mthcht - I'm currently building the script that people can run to generate the telemetry. It'll be in Python. Could you provide a similar method for testing for this suggested event?

@SecurityAura - I agree on having a category for network connections, but I'm not sure we need to be that explicit. I'll keep it in mind tho while I'm testing various EDRs, thanks 🙏

@madhusudanpatnaik - thanks for the suggestions. Most of those are attack techniques and are more in line with detection events rather than telemetry. That said, some of the ones you mentioned are ones we will include, like process injection, network connections, etc.

Aegrah commented 1 month ago

Nice start @tsale.

You suggest driver/image load; I'm unsure whether this means shared object load and LKM load. Otherwise, an LKM load would be good to add.

Certain syscalls are very useful/important to detect. Just to mention a few: think about PTRACE() and MEMFD_CREATE() to detect certain process injection/fileless execution techniques. KILL() with unusual signal numbers (typically 32 and up) can be interesting for detecting certain rootkits, and MPROTECT() is useful for detecting RWX access, which generally speaking is odd.

File size and file entropy are important data sources, as these will help in detecting e.g. ransomware attacks, web shells, and more.
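A minimal sketch of how size and entropy could be computed for a file, assuming Shannon entropy over byte frequencies is the measure meant here. The function names are hypothetical and not part of any EDR or of the generator script.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def file_size_and_entropy(path: str) -> tuple[int, float]:
    """Read a file fully and return (size in bytes, entropy in bits per byte)."""
    with open(path, "rb") as fh:
        data = fh.read()
    return len(data), shannon_entropy(data)
```

Encrypted or compressed content tends to score close to 8 bits per byte, while plain text scores noticeably lower, which is why a sudden jump in entropy across many files can hint at ransomware activity.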

File header bytes are also useful; given that it is often important to note whether an executable that is dropped and executed is a simple script or an actual ELF binary for example.
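As a rough illustration of the header-bytes idea, the sketch below distinguishes an ELF binary from a shebang script by looking at the first bytes of a file. The function names and the three labels are assumptions made for this example only.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file
SHEBANG = b"#!"         # interpreter line used by scripts


def classify_header(header: bytes) -> str:
    """Classify a file from its leading bytes as 'elf', 'script' or 'unknown'."""
    if header.startswith(ELF_MAGIC):
        return "elf"
    if header.startswith(SHEBANG):
        return "script"
    return "unknown"


def classify_file(path: str, length: int = 4) -> str:
    """Read the first `length` bytes of `path` and classify them."""
    with open(path, "rb") as fh:
        return classify_header(fh.read(length))
```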

I'm curious to see what the ScheduledTask data source is about. This is just an execution event of a systemd, cron, at or other scheduling service. Not sure whether it makes sense for this to be a specific data source.

The same applies to the ServiceCreation, ServiceModification, ServiceDeletion data sources. These just mean "file creation/modification/deletion in certain systemd directories". Not sure whether it makes sense for those to be their own data source.

For AgentStart and AgentStop, this is just a process start or end event for a certain process name. Having that as its own individual data source is not per se necessary to have good telemetry on an agent starting/stopping.

I would also like to point out that there is a very important distinction to make between having elaborate telemetry and having useful data to work with. For example, file open events might be useful for SSH keys, wallets, and certain sensitive files, but if you collect file open events for every file on the system (in Linux, everything is a file), you will flood your cluster with logs within a few minutes.

I hope some of this is useful!

tsale commented 1 month ago

Thanks for your thoughtful input @Aegrah, we're definitely on the same page, which makes me really happy! 😊

Regarding driver/image load, you're absolutely correct – it’s about visibility into LKM load. I’m currently in the process of building a telemetry generator script in Python that will specifically use system calls and related methods to test EDR visibility (without relying on binaries). For the injection-related events, I’ll indeed be leveraging PTRACE() and MEMFD_CREATE() to replicate various injection techniques.

For the scheduled task, I plan to implement a cron job with a * * * * * schedule, which will be written and executed on service reload. For services, I'll be using the dbus library to generate the necessary events, keeping it aligned with how those actions would typically be performed.
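A rough sketch of what writing such a cron job could look like, assuming the system-wide /etc/cron.d format, where each line carries a user field between the schedule and the command. The helper names, the file name and the command are hypothetical, and this is not necessarily how the generator script does it.

```python
from pathlib import Path


def cron_d_entry(command: str, user: str = "root",
                 schedule: str = "* * * * *") -> str:
    """Build one /etc/cron.d style line: five time fields, user, command."""
    if len(schedule.split()) != 5:
        raise ValueError("schedule must have exactly five fields")
    return f"{schedule} {user} {command}\n"


def write_cron_file(directory: str, name: str, entry: str) -> Path:
    """Write the entry into `directory` (e.g. /etc/cron.d, which needs root)."""
    path = Path(directory) / name
    path.write_text(entry)
    path.chmod(0o644)  # cron commonly ignores files that others can write to
    return path
```

A call such as `write_cron_file("/etc/cron.d", "edr_telemetry_test", cron_d_entry("/bin/true"))` would schedule a harmless command every minute. The file should be removed again once the test has run.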

If you’re interested, you can check out the current progress of the telemetry generator script and how it’s generating telemetry for each category here: Linux Telemetry Generator Branch (still a work in progress).

As for the Agent Start/Stop events, I agree—it’s nice to have, but doesn’t necessarily need to be a standalone event in the console.

Looking forward to your thoughts!

Arignir commented 1 month ago

Hello,

Here are my two cents on the subject:

And finally, here's a few ideas for new telemetries:

tsale commented 1 month ago

Thanks for the continued support @Arignir, it’s really helpful! Regarding the CreateRemoteThread equivalent on Linux, the activity we’d expect to see under that sub-category would involve loading a shared library into the target process via dlopen.

As for services, we’re focusing on filesystem service telemetry, and I refer to my previous comment on the topic. I agree that syscall events aren’t needed; something more abstract would be better. For eBPF events, could you or @mthcht provide an example of activity that would trigger such events?

I love the user account creation/modification/deletion idea, and the URL telemetry as well! I'll be editing the above list to include those 🙂

zmallen commented 1 month ago

I imagine driver load has a much different context in Linux than Windows. I like being specific with LKM load, but there are plenty of other ways to load modules in user land: LD_PRELOAD and other modules in userland, such as PAM backdoors, fit into that category

tsale commented 1 month ago

Thanks @zmallen, what do you propose we should name this sub-category instead?

Arignir commented 4 weeks ago

IMO we shouldn't try to group LKM loading and LD_PRELOAD in the same category, as they are vastly different. The first one could be labelled Linux Kernel Modules Loading or something similar, while the second one could be labelled dynamic linker hijacking, userland rootkits, etc.

https://attack.mitre.org/techniques/T1574/006/

zmallen commented 4 weeks ago

makes sense to me, there's a lot of research on userland rootkits outside of LD_PRELOAD too (technology-specific modules for nginx, php, PAM, apache etc). Makes me wonder where eBPF rootkits sit in this too :)

since the boundary between user and kernel is a bit more pronounced, i agree we should separate

tsale commented 4 weeks ago

@Arignir Do you have implementation examples outside of what I've built so far? People would want to test it to generate the telemetry data.

Arignir commented 4 weeks ago

@tsale

Regarding CreateRemoteThread and your RemoteLibraryInjector class, ptrace() doesn't work that way: you are injecting yourself instead of injecting the remote process. You need to use ptrace_peektext to read the process' memory, patch it up with ptrace_poketext and fix the thread's registers using ptrace_getregs and ptrace_setregs to make it run your shellcode.

Regarding ProcessAccess, you're relying on psutil to iterate over the process list, which itself iterates over /proc/ and reads /proc/<pid>/comm. On Windows, Sysmon defines ProcessAccess (EventID 10) as when a process opens another process, which is similar to what ptrace() does on Linux. What your test is doing is more akin to process discovery. In retrospect, I believe the Ptrace telemetry and the ProcessAccess telemetry are more or less one and the same, and should probably be merged.
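To make the ptrace-as-ProcessAccess idea concrete, here is a minimal sketch that attaches to a child process it spawned itself and detaches again, without reading or patching any memory. It assumes Linux with glibc reachable through ctypes. The request numbers come from <sys/ptrace.h>, the function names are hypothetical, and the attach may be refused on hardened systems.

```python
import ctypes
import os
import subprocess
import sys

PTRACE_ATTACH = 16  # request numbers from <sys/ptrace.h> on Linux
PTRACE_DETACH = 17

_libc = ctypes.CDLL(None, use_errno=True)
_libc.ptrace.argtypes = [ctypes.c_long, ctypes.c_int,
                         ctypes.c_void_p, ctypes.c_void_p]
_libc.ptrace.restype = ctypes.c_long


def ptrace_attach_detach(pid: int) -> int:
    """Attach to `pid`, wait for it to stop, detach. Return 0 or an errno."""
    ctypes.set_errno(0)
    if _libc.ptrace(PTRACE_ATTACH, pid, None, None) == -1:
        return ctypes.get_errno()  # e.g. EPERM when attaching is not allowed
    os.waitpid(pid, 0)             # the tracee stops once the attach lands
    _libc.ptrace(PTRACE_DETACH, pid, None, None)
    return 0


def demo() -> int:
    """Spawn a harmless sleeping child, touch it with ptrace, clean up."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        return ptrace_attach_detach(child.pid)
    finally:
        child.kill()
        child.wait()
```

Targeting a child the test created itself avoids disturbing unrelated processes while still producing ptrace calls for an EDR to observe.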

Regarding RawDeviceAccess, you're reading /dev/zero. I understand that this is probably a placeholder but EDR will likely blacklist raw access to /dev/zero as it is very commonly used. IMO it's safe to open and read the main hard drive, as long as you make sure to open it in read-only mode, but maybe someone else has a different opinion on that? You're less likely to hit the EDR's blacklist in doing so, and the test is more realistic.
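A minimal sketch of a read-only raw read, assuming the caller passes the block device path explicitly. A path such as /dev/sda is only an example, differs between systems, and normally requires root. The function name is hypothetical.

```python
import os


def read_first_sector(device_path: str, size: int = 512) -> bytes:
    """Open `device_path` strictly read-only and return up to `size` bytes."""
    fd = os.open(device_path, os.O_RDONLY)  # never request write access here
    try:
        return os.read(fd, size)
    finally:
        os.close(fd)
```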

I like that for User events you use a library instead of relying on useradd/usermod/userdel. EDR won't be able to cheese that telemetry by parsing the process telemetry and will have to monitor changes in /etc/passwd. It's a good idea!

For Linux Kernel Modules loading, you can take the example code from TheXcellator's tutorial on how to make a Linux rootkit, build it and load/unload it. The first example is basically just a hello world.

For NetworkListen and NetworkRawSocket (I insist those should be split from NetworkConnect), you can use socket.bind() for the first one and socket(AF_PACKET, SOCK_RAW); s.bind(("eth1", 0)) for the second one (see here for an example).
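The two calls above could be wrapped roughly as follows. This is a sketch only: the function names are hypothetical, the interface name has to be supplied by the caller, and the ETH_P_ALL protocol value is an addition not mentioned in the thread. Opening an AF_PACKET socket is Linux-specific and needs CAP_NET_RAW (typically root).

```python
import socket

ETH_P_ALL = 0x0003  # "all protocols" value from <linux/if_ether.h>


def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """NetworkListen: bind a TCP socket and start listening on it."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))  # port 0 lets the kernel pick a free port
    s.listen(1)
    return s


def open_raw_socket(interface: str) -> socket.socket:
    """NetworkRawSocket: open a packet socket bound to one interface."""
    s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                      socket.htons(ETH_P_ALL))
    s.bind((interface, 0))
    return s
```

Keeping the listener on the loopback address limits exposure during a test run, and both sockets should be closed as soon as the event has been generated.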

For Ptrace, like I hinted earlier, IMO the CreateRemoteThread/ProcessAccess/ProcessTampering are basically all subsets of the Ptrace telemetry. If you go through your CreateRemoteThread example you'll already have a pretty good telemetry generator for ptrace.

For eBPF, if you don't want to spend too much time learning how eBPF works and all, you can use tracee, which will implicitly load eBPF programs to do its shit. You can also have a look at pamspy, which is a credential dumper based on eBPF. If you wanna dig more into it, here's the eBPF documentation.

Let me know if there's anything more you need~

tsale commented 4 weeks ago

This is great; thank you for taking the time to provide your feedback and guidance @Arignir! Some info regarding your comments:

Regarding CreateRemoteThread and your RemoteLibraryInjector class, ptrace() doesn't work that way: you are injecting yourself instead of injecting the remote process. You need to use ptrace_peektext to read the process' memory, patch it up with ptrace_poketext and fix the thread's registers using ptrace_getregs and ptrace_setregs to make it run your shellcode.

What are your thoughts on this? process_hijack_demo.py. Additionally, I agree; we should consolidate into a single category named ProcessAccess, rather than having separate ones for CreateRemoteThread and ProcessAccess respectively.


Regarding RawDeviceAccess, you're reading /dev/zero. I understand that this is probably a placeholder but EDR will likely blacklist raw access to /dev/zero as it is very commonly used......

Thanks for this suggestion! I agree and fixed it lnx_telem_gen.py#L318.


Regarding Linux Kernel Modules - This is basically what I am doing :)


For NetworkListen and NetworkRawSocket (I insist those should be split from NetworkConnect), you can use socket.bind() for the first one and socket(AF_PACKET, SOCK_RAW); s.bind(("eth1", 0)) for the second one (see here for an example).

Thanks for the suggestion. I think others echoed your opinion so I've introduced the new category for NetworkListen and separated it from NetworkConnect and also introduced NetworkRawSocket. See the implementations here: lnx_telem_gen.py#L21


I’ll check out eBPF, though I think I’ll go with the tools you suggested—it’s a bit advanced for me at the moment, so I’d rather not reinvent the wheel. Looking forward to your thoughts! Thanks again 🙏

Arignir commented 3 weeks ago

Hello,

What are your thoughts on this? process_hijack_demo.py.

You're injecting into a random process, which is pretty dangerous: you could fuck up a critical process and freeze the user's system.

On top of that, the modifications you're doing do not result in an actual injection (the memory you're reading/patching is most likely not mapped, you're setting eip to a random address, the shellcode is for 32-bit x86, etc.), and will most likely crash the process. This might be fine: you don't have to actually do an injection, and having a couple of ptrace calls here and there can be enough. It depends on how realistic you want the test to be.

Thanks for this suggestion! I agree and fixed it lnx_telem_gen.py#L318.

LGTM! Thanks!

Regarding Linux Kernel Modules - This is basically what I am doing :)

Great!

I’ll check out eBPF, though I think I’ll go with the tools you suggested—it’s a bit advanced for me at the moment, so I’d rather not reinvent the wheel. Looking forward to your thoughts! Thanks again 🙏

Sure! Building and running pamspy can be a pretty realistic example, it's similar to what I expect an eBPF malware/rootkit will do.

Thank you very much for your efforts!

tsale commented 3 weeks ago

Thanks for pointing that out @Arignir ! I’ve fixed the part about injecting into a random process, so it’s now safer. For the realism of the modifications, we’re not too concerned with making it look completely realistic—our goal is just to cover the basics and generate the necessary telemetry. Nothing fancy, just enough to get the needed visibility.

Also, I’ve added eBPF events via pamspy (thanks for the suggestion!). I’ve created a README file for the Linux telemetry generation, which will serve as the documentation for the events we’re targeting, the actions taken to generate that telemetry, and what we’re looking for in the resulting events.

Looks like we're almost ready to start testing some EDRs 😃 Are there any other events that we need to include? Now is the time y'all, please comment here and add your two cents!

Appreciate the feedback! 🙏

Arignir commented 3 weeks ago

Thanks for your patience, time and dedication @tsale !