toxyl / ossh

... is a dirty mix of honey and tar, delivered by a fake SSH server.

Add a 'dynamic' file system #4

Closed dylandreimerink closed 2 years ago

dylandreimerink commented 2 years ago

Currently we have a read-only file system, which works for simple bots that want to read a file and then bail, but this is not enough if we want to capture more advanced "attacks". It is supposedly very common to upload a binary or script and then execute it, instead of typing out each command one by one.

In memory vs on disk

On-disk Disk offers far more storage than RAM. By actually storing files on the disk of the host, we get the most room. We also get some aspects for free, like directories, path resolution, and metadata.

By doing this we risk breakout: if done incorrectly, bots or clever people can escape the honeypot and read from or write to the non-sandboxed part of the disk. We can limit this by running the honeypot as a user with limited access outside its directory, and by using a chroot to confine the process to this subdirectory (this would involve creating subprocesses and IPC so the main process can still write the results as usual).
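As one illustration of the breakout concern, here is a minimal sketch of confining client-supplied paths to the sandbox directory before touching the disk. The function name `confine` and the root `/srv/ossh/sandbox` are hypothetical, not part of ossh. It only covers `..` traversal; symlinks created inside the sandbox would still need the chroot or limited-user measures described above.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// confine maps a client-supplied path onto a location under root.
// Prefixing "/" before cleaning lets filepath.Clean collapse any leading
// ".." elements, so the result cannot climb above root lexically.
// It does NOT resolve symlinks; that still needs chroot or similar.
func confine(root, clientPath string) string {
	cleaned := filepath.Clean("/" + clientPath)
	return filepath.Join(root, cleaned)
}

func main() {
	root := "/srv/ossh/sandbox" // hypothetical sandbox directory
	paths := []string{"tmp/payload.sh", "../../etc/passwd", "/etc/../../root/.ssh"}
	for _, p := range paths {
		fmt.Printf("%q -> %s\n", p, confine(root, p))
	}
}
```

Every path a client sends would go through `confine` before any file operation, so a request for `../../etc/passwd` lands on a file inside the sandbox instead of the host's real one.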

In-memory An in-memory FS is more "work", since we have to emulate FS features like directories, metadata, and permissions ourselves. We are limited in the amount of data we can store in such a FS, so this likely only works for ephemeral data. However, it is also the easiest to secure.

Shared vs isolated

Shared A shared FS has a few advantages, namely that it is easy to implement, uses less storage and is more realistic. However, it makes it harder to track/isolate which files were changed by which session/client, and it lets different bots interfere with each other.

Isolated Having a FS per client makes it easy to see what files were created or altered by individual clients. The main downside is that you have to give each client its own FS, which requires more resources and is more complex.

Persistent vs Ephemeral

Persistent Having a FS persist over multiple sessions is the most realistic way to emulate files. This assumes bots may do something and later come back to perform actions which depend on their earlier visits. The main downside is that tracking per-session changes is more difficult, and that it takes more resources since we need to store FSes for a certain amount of time.

Ephemeral After a session is done, we could just delete the files, or at least not make them accessible to the same client on subsequent visits.

Proposals

Isolated, Ephemeral, in-memory FS Give each session an in-memory FS, starting with a clone of the read-only FS, then record all actions (creating, editing, deleting). Upon session close we save the FS to disk for analysis and discard the in-memory data.
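A rough sketch of what this proposal could look like, assuming a flat path-to-bytes map stands in for the read-only FS. The types `SessionFS` and `Action` are hypothetical names for illustration; directories, permissions and the save-to-disk step on session close are left out.

```go
package main

import "fmt"

// Action records one change made during a session.
type Action struct {
	Op   string // "create", "edit" or "delete"
	Path string
}

// SessionFS is a per-session, in-memory copy of the base file system.
type SessionFS struct {
	files   map[string][]byte
	Actions []Action
}

// NewSessionFS deep-copies base so a session can never alter the original.
func NewSessionFS(base map[string][]byte) *SessionFS {
	files := make(map[string][]byte, len(base))
	for p, data := range base {
		files[p] = append([]byte(nil), data...)
	}
	return &SessionFS{files: files}
}

// Write stores data at path and logs it as a create or an edit.
func (fs *SessionFS) Write(path string, data []byte) {
	op := "create"
	if _, ok := fs.files[path]; ok {
		op = "edit"
	}
	fs.files[path] = append([]byte(nil), data...)
	fs.Actions = append(fs.Actions, Action{Op: op, Path: path})
}

// Delete removes path and logs it; it reports whether the file existed.
func (fs *SessionFS) Delete(path string) bool {
	if _, ok := fs.files[path]; !ok {
		return false
	}
	delete(fs.files, path)
	fs.Actions = append(fs.Actions, Action{Op: "delete", Path: path})
	return true
}

func main() {
	base := map[string][]byte{"/etc/hostname": []byte("honeypot\n")}
	fs := NewSessionFS(base)
	fs.Write("/tmp/x.sh", []byte("#!/bin/sh\n"))
	fs.Write("/etc/hostname", []byte("pwned\n"))
	fs.Delete("/tmp/x.sh")
	for _, a := range fs.Actions {
		fmt.Println(a.Op, a.Path)
	}
	fmt.Printf("base untouched: %q\n", base["/etc/hostname"])
}
```

On session close, the `Actions` log and the remaining `files` map are what would be written to disk for analysis before the session's data is dropped.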

Shared, Persistent, on-disk One of the simplest ways to solve the issue is to just designate some subdirectory somewhere and have all clients share it. Everyone can use it, and we clean it up once a week to avoid permanently breaking bots. However, recording the contents of files, and specifically which session is responsible for which change, is difficult and would require a lot of code to make work.

Isolated, Persistent, on-disk My personal favorite. We can use OverlayFS, a built-in kernel feature. We start with the base layer (the current read-only FS), and each client gets its own layer on top of that. We can give each session its own layer, which can be stacked on top of layers created by the same client in earlier sessions. This makes it easy to see exactly what files changed in which session and compare them over time.
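To make the layering concrete, here is a sketch of how the mount options for such a stack might be assembled. The function `overlayOptions` and all directory names are hypothetical. OverlayFS treats the leftmost `lowerdir` entry as the topmost lower layer, so earlier session layers are listed newest first, with the base FS last. The actual mount (for example via `syscall.Mount` on Linux) needs privileges and is only shown as a comment.

```go
package main

import (
	"fmt"
	"strings"
)

// overlayOptions builds the option string for an overlay mount.
// base is the read-only FS; earlier holds the client's previous session
// layers, oldest first; upper and work belong to the current session
// (workdir must be an empty directory on the same file system as upper).
// Paths containing ':' or ',' are not escaped in this sketch.
func overlayOptions(base string, earlier []string, upper, work string) string {
	lower := make([]string, 0, len(earlier)+1)
	// Leftmost lowerdir is the topmost layer, so add newest sessions first.
	for i := len(earlier) - 1; i >= 0; i-- {
		lower = append(lower, earlier[i])
	}
	lower = append(lower, base)
	return fmt.Sprintf("lowerdir=%s,upperdir=%s,workdir=%s",
		strings.Join(lower, ":"), upper, work)
}

func main() {
	// All paths below are made up for illustration.
	opts := overlayOptions(
		"/srv/ossh/base",
		[]string{"/srv/ossh/clients/c1/s1", "/srv/ossh/clients/c1/s2"},
		"/srv/ossh/clients/c1/s3",
		"/srv/ossh/clients/c1/work",
	)
	fmt.Println(opts)
	// With sufficient privileges on Linux, the mount itself could be:
	//   syscall.Mount("overlay", "/srv/ossh/clients/c1/merged", "overlay", 0, opts)
}
```

After the session ends, the session's `upperdir` contains exactly the files that session created or changed, which is what makes per-session comparison straightforward.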

toxyl commented 2 years ago

I like the isolated, persistent, on-disk proposal. I have seen bots come back running different payloads, and I have also seen bots logging in, trying to execute a script that would usually not be present, and then logging out, so I would assume that they (or another bot) downloaded it before. This might be a challenge for the isolated approach, as bot(net)s can run on multiple IPs, so e.g. bot A drops the payload, then bot B logs in and executes it. Maybe we can isolate per prefix, or use a custom mapping that lets us bind a group of IPs to a specific isolated FS.
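A small sketch of the per-prefix idea, assuming clients are grouped by /24 for IPv4 and /64 for IPv6. Those prefix lengths and the function name `groupKey` are illustrative choices, not something decided in this thread. The returned string could then serve as the key that selects an isolated FS.

```go
package main

import (
	"fmt"
	"net/netip"
)

// groupKey maps a remote IP to the prefix that identifies its shared FS.
// The /24 and /64 prefix lengths are assumptions made for this sketch.
func groupKey(remote string) (string, error) {
	addr, err := netip.ParseAddr(remote)
	if err != nil {
		return "", err
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as plain IPv4
	bits := 64
	if addr.Is4() {
		bits = 24
	}
	prefix, err := addr.Prefix(bits)
	if err != nil {
		return "", err
	}
	return prefix.String(), nil
}

func main() {
	for _, ip := range []string{"203.0.113.7", "203.0.113.200", "2001:db8::1"} {
		key, err := groupKey(ip)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(ip, "->", key)
	}
}
```

With this grouping, bot A on 203.0.113.7 and bot B on 203.0.113.200 would land in the same FS. A custom IP-to-group mapping could replace `groupKey` without changing the rest of the design.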

dylandreimerink commented 2 years ago

Cool, going with the OverlayFS approach, using https://github.com/moby/moby/blob/master/daemon/graphdriver/overlay2/overlay.go as inspiration for creating the mounts from Go.

The idea of isolating per prefix is a good one. But let's add that in a second iteration; I will take it into account when making the initial version.