nyaruka / courier

Messaging gateway for RapidPro/TextIt.

Malicious requests overloading channel_logs table #347

Closed caioreix closed 1 year ago

caioreix commented 3 years ago

Some contacts are sending malicious messages with the intention of blocking the recipient's WhatsApp application. The content of these messages is Unicode text that visually appears to be about 10 characters but actually contains 4k or more. We have also encountered “contact bombs”: requests that carry several contacts whose data is also filled with malicious Unicode like the above. This is overloading our database, because even after we block the contact, courier still saves the request logs in the channel_logs table. We would like to know whether there is already a way to ignore requests above a certain size and, if there isn't, whether we could discuss developing one. One possible solution we have thought about: if the contact is blocked, do not save its requests.
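To illustrate how text can look short yet be large, here is a minimal Go sketch. The thread does not say which code points the attackers use; stacking combining marks (U+0301 here) on a few base letters is only an assumed example of the effect, and the names `measure` and `stackedText` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// textStats holds three different notions of a string's "length".
type textStats struct {
	Bytes          int // size in UTF-8 bytes
	Runes          int // number of Unicode code points
	CombiningMarks int // code points in category Mn (nonspacing marks)
}

// measure counts bytes, code points and nonspacing marks in s.
func measure(s string) textStats {
	st := textStats{Bytes: len(s), Runes: utf8.RuneCountInString(s)}
	for _, r := range s {
		if unicode.Is(unicode.Mn, r) {
			st.CombiningMarks++
		}
	}
	return st
}

// stackedText builds `letters` copies of "a", each followed by
// `marksPerLetter` combining acute accents (assumed attack shape).
func stackedText(letters, marksPerLetter int) string {
	return strings.Repeat("a"+strings.Repeat("\u0301", marksPerLetter), letters)
}

func main() {
	s := stackedText(10, 400) // 10 visible letters
	fmt.Printf("%+v\n", measure(s))
}
```

A size check based on bytes or code points catches such text regardless of how few characters it appears to contain on screen.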

johncordeiro commented 3 years ago

That's a great suggestion @caioreix , @nicpottier what do you think about the proposed solution, can we open a PR?

nicpottier commented 3 years ago

Hrmm, ya it does seem we could probably be more aggressive with blocked contacts. I hesitate to ignore their messages entirely but perhaps we could have a more aggressive limit on their message size?

How many messages are we talking about here?

johncordeiro commented 3 years ago

It's about 5 messages per second, which adds up to more than 400k messages per day 😅. This is a WhatsApp contact. We opened a support ticket and were told there is no way to blacklist these phone numbers through the Business API, so maybe ignoring them completely is the best thing to do. Let me know if we can open a PR for this.

nicpottier commented 3 years ago

Geez, that's pretty crazy. Does seem like the only option there is to ignore those messages entirely. @rowanseymour any thoughts here?

Only other thing I could think of would be to add yet another state to contact "block with extreme prejudice" but that doesn't seem sane. But ya, even creating empty messages at the rate of 400k a day would still be bad.

rowanseymour commented 3 years ago

I think hard blocking is reasonable. Seems a little mean that we charge a credit to handle messages from a blocked contact.

rasoro commented 3 years ago

Hi there. Recently we have received many contact-bomb messages on WhatsApp channels, which generate many records in the channel_log table even when the contact is blocked, because these records are saved with the status "Request Ignored". To get around this problem, we improvised a check for whether the contact is blocked, which avoids writing any request record for that contact.

Even so, records kept being generated in channel_log: some contact bombs had very large message text, which caused an error in DecodeAndValidateJSON when the request was decoded, so we also made an adjustment to stop writing these errors to the DB.

Ideas are welcome and I would be grateful if we could collaborate to create a more elegant solution to these problems.

rowanseymour commented 3 years ago

Hi @rasoro how do you know who to block prior to parsing the payload? IP address?

rasoro commented 3 years ago

Hi @rowanseymour I didn't do it that way, although it could be an idea to explore.

What I did is this: when the request enters receiveEvent in the courier handler, it tries to parse the payload, and if the payload is over 1000000 bytes it returns an error. I catch that and return a different "too large payload" error to server.go, where I put an if error == "too large payload" then "return" before the WriteAndLogRequestError call, to prevent the request from being inserted into the database.

All this because our main concern at the moment is DB overload caused by the size of the requests being saved in channel_log: the attackers are sending many contact bombs, approximately 1.2 MB of payload per second, causing the table to grow absurdly.
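The size check described above could be sketched roughly as follows. This is not courier's actual code: `readLimited`, `receive`, `writeLog` and `simulate` are hypothetical names, and only the 1000000-byte limit and the "too large payload" error text come from this thread.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

const maxBodyBytes = 1_000_000 // limit mentioned in the thread

var errPayloadTooLarge = errors.New("too large payload")

// readLimited reads at most max+1 bytes, so an oversized body is
// detected without buffering all of it.
func readLimited(body io.Reader, max int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(body, max+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > max {
		return nil, errPayloadTooLarge
	}
	return data, nil
}

// receive rejects oversized requests before anything is logged.
// writeLog stands in for whatever persists a channel log row.
func receive(w http.ResponseWriter, r *http.Request, writeLog func(body []byte)) {
	body, err := readLimited(r.Body, maxBodyBytes)
	if errors.Is(err, errPayloadTooLarge) {
		http.Error(w, err.Error(), http.StatusRequestEntityTooLarge)
		return // deliberately skip logging
	}
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	writeLog(body)
	w.WriteHeader(http.StatusOK)
}

// simulate sends a body of the given size through receive and reports
// the response status and whether a log row would have been written.
func simulate(size int) (status int, logged bool) {
	req := httptest.NewRequest(http.MethodPost, "/receive",
		strings.NewReader(strings.Repeat("x", size)))
	rec := httptest.NewRecorder()
	receive(rec, req, func([]byte) { logged = true })
	return rec.Code, logged
}

func main() {
	for _, size := range []int{100, maxBodyBytes + 1} {
		status, logged := simulate(size)
		fmt.Printf("size=%d status=%d logged=%t\n", size, status, logged)
	}
}
```

Comparing against a sentinel error with errors.Is avoids matching on the error string. The standard library's http.MaxBytesReader is another way to cap the body size.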

Moreover, even when the message is not too big and courier parses the payload without returning an error, I added a check in the handler for whether the contact has a blocked status, so that it doesn't generate and insert a "Request Ignored" log.
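A rough sketch of that blocked-contact check, under the assumption that the contact's status is already known when the handler runs. The types and names (`contactStatus`, `channelLogWriter`, `handleIncoming`) are hypothetical stand-ins and do not mirror courier's real data model.

```go
package main

import "fmt"

type contactStatus string

const (
	statusActive  contactStatus = "active"
	statusBlocked contactStatus = "blocked"
)

// channelLogWriter is an in-memory stand-in for the channel log table.
type channelLogWriter struct{ rows []string }

func (w *channelLogWriter) write(description string) {
	w.rows = append(w.rows, description)
}

// handleIncoming reports whether the message was accepted. For a
// blocked contact it returns early and writes no log row at all,
// not even a "Request Ignored" one.
func handleIncoming(status contactStatus, logs *channelLogWriter) bool {
	if status == statusBlocked {
		return false
	}
	logs.write("Message Accepted")
	return true
}

func main() {
	logs := &channelLogWriter{}
	fmt.Println(handleIncoming(statusBlocked, logs), len(logs.rows))
	fmt.Println(handleIncoming(statusActive, logs), len(logs.rows))
}
```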

rowanseymour commented 3 years ago

We could also truncate request bodies when writing channel logs - we already exclude non-text bodies so we could say we only write the first X chars of body to the channel log.
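As a minimal sketch of what such truncation might look like, assuming "chars" means Unicode code points. The function name `truncateForLog` and the "..." suffix are illustrative choices, not courier's implementation.

```go
package main

import "fmt"

// truncateForLog keeps at most maxChars code points of body and appends
// "..." when something was cut. Ranging over the string yields the byte
// offset of each code point, so the cut never splits a UTF-8 sequence.
func truncateForLog(body string, maxChars int) string {
	count := 0
	for i := range body {
		if count == maxChars {
			return body[:i] + "..."
		}
		count++
	}
	return body // already short enough
}

func main() {
	fmt.Println(truncateForLog("héllo wörld", 5))
	fmt.Println(truncateForLog("short", 10))
}
```

Counting code points keeps the stored text valid UTF-8. If the goal is to bound storage strictly, a byte limit cut at a code point boundary would be the tighter choice, since one code point can take up to 4 bytes.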

rasoro commented 3 years ago

I think it could be a great idea for "ignored request" logs (or even for any very large request): keep only X chars, running from the headers to a bit of the body, so we still have some information about the origin of the request.

rasoro commented 3 years ago

> Geez, that's pretty crazy. Does seem like the only option there is to ignore those messages entirely. @rowanseymour any thoughts here?
>
> Only other thing I could think of would be to add yet another state to contact "block with extreme prejudice" but that doesn't seem sane. But ya, even creating empty messages at the rate of 400k a day would still be bad.

Something like this would be great too, for contacts identified as malicious. Do you agree?