Logging and DDOS protection

hoytech / strfry

a nostr relay

GNU General Public License v3.0

505 stars 99 forks source link

Logging and DDOS protection #24

Open pjv opened 1 year ago

pjv commented 1 year ago

This morning I’m getting a whole lot (thousands) of this error message repeated in my strfry logs:

too many concurrent REQs

Unfortunately, that error is logged without the IP address that is intentionally or unintentionally DDOSing my instance.

So I turned on verbose REQ logging and found the IP address in the initial connect and manually blocked it, but that’s obviously not a sustainable workflow, so I’m thinking that maybe it would be good for certain error messages (this one for sure) to log the IP address along with the error every time so that log monitors like fail2ban could keep an eye on things and automatically impose some IP blocking and unblocking by policy.

hoytech commented 1 year ago

Good idea! It will need a bit of a refactor to log the IP on that line, but it should be possible. I'm adding that to the TODO list.

One thing that may help in the meantime: The numbers inside the square brackets correspond to the connection ID. For example:

2023-02-27 06:47:53.857 (   4.084s) [Websocket       ]INFO| [1] Connect from 127.0.0.1 compression=N sliding=N
2023-02-27 06:47:53.857 (   4.085s) [Ingester 1      ]INFO| sending error to [1]: bad msg: unknown cmd

In these two log lines, the [1] on the Connect line matches the [1] in the error line. So you could build a log scanner that keeps a mapping of connection ID to IP address.

It's not ideal, because you'd need to look arbitrarily far back in time to find the Connect log (well, at least back to server restart, at which point the conn IDs are reused), and you might have to worry about log rotations and such.

In the end, did you figure out if it was a buggy client or malicious activity?

pjv commented 1 year ago

Thanks, yeah, I understood the square bracket integer thing which is how I manually figured out which IP address was hammering me.

My guess is that as with web servers, this (intentional or unintentional DOS-ing) is going to be a pretty regular thing for nostr relays, so it would probably be good to make logging the IP address and a timestamp for all kinds of requests and errors a basic idiom so that tools like fail2ban can just watch the logs and do what they need to do to keep the server online.

I wasn’t able to figure out what was hitting the server or why. It was an IPV6 address that just mapped to the IP space of a commercial ISP in Germany - i.e. likely a home user - and I have no idea what they were requesting since strfry was just erroring the requests out.

Luckily, it was just the one host; I blocked the IP and haven’t seen that error message again since.

pjv commented 1 year ago

@hoytech if you can tell me if there is a relatively simple way to access the IP address of the host from here: https://github.com/hoytech/strfry/blob/e5ec135e78170756f5fd83a2cf4e25d611efd1ce/src/RelayReqMonitor.cpp#L44

I will try to modify that error string to include the IP address and will then be able to automatically block the boxes that are slamming my relay now too fast and furious for me to handle manually, and then submit a PR.