crowdsecurity / hub

Main repository for crowdsec scenarios/parsers
https://hub.crowdsec.net

CrowdSec has a huge Matrix problem #547

Open ethindp opened 2 years ago

ethindp commented 2 years ago

Okay, so... CrowdSec has a major problem with Matrix. I just had to put my entire CrowdSec installation into simulation mode globally because it keeps generating a huge number of false positives, effectively isolating my server from the internet. What's the best way to go about fixing this? I need to whitelist a few subdomains, then explicitly allow port 8448 (and probably a bunch of others). I could just uninstall all the HTTP-based scenarios, but I really don't want to do that because those are actually useful, just not in their current configuration.

buixor commented 2 years ago

Hello,

Are you able to share logs generated by Matrix? A list of the scenarios that triggered false positives might be useful too :)

A possibility is to have some matrix-specific whitelists when it is relevant :)

LaurenceJJones commented 2 years ago

link to task #575

tetsumaki commented 1 year ago

Same here from my synapse server with:

It is normal to have frequent 404 errors.

cscli alerts inspect -d <id>:

- Date: 2022-11-21 16:51:07 +0000 UTC
+-----------------+------------------------------------------------------------------------------+
|       KEY       |                                    VALUE                                     |
+-----------------+------------------------------------------------------------------------------+
| ASNNumber       |                                                                         3215 |
| ASNOrg          | Orange                                                                       |
| IsInEU          | true                                                                         |
| IsoCode         | FR                                                                           |
| SourceRange     | 92.184.96.0/19                                                               |
| datasource_path | /var/lib/caddy/log/matrix.domain.tld.log                                     |
| datasource_type | file                                                                         |
| http_path       | /_matrix/media/r0/preview_url?url=https%3A%2F%2Fapt-cacher.net.xxx.fr%3A3142 |
| http_status     |                                                                          404 |
| http_user_agent | Element/1.5.7 (Gigaset GS290;                                                |
|                 | Android 10; e_GS290-user                                                     |
|                 | 10 QQ3A.200805.001                                                           |
|                 | eng.root.20221031.181139                                                     |
|                 | dev-keys,dev-release; Flavour                                                |
|                 | FDroid; MatrixAndroidSdk2                                                    |
|                 | 1.5.7)                                                                       |
| http_verb       | GET                                                                          |
| log_type        | http_access-log                                                              |
| service         | http                                                                         |
| source_ip       | 92.184.xxx.x                                                                 |
| target_fqdn     | matrix.domain.tld                                                            |
| timestamp       | 2022-11-21T16:51:07Z                                                         |
+-----------------+------------------------------------------------------------------------------+

I found several ways to fix this with postoverflows whitelist:

Normal:

# cat /etc/crowdsec/postoverflows/s01-whitelist/whitelist-matrix.yaml (generic) :

name: tetsumaki/matrix
description: "custom matrix whitelist"
whitelist:
  reason: "whitelist false positive for matrix"
  expression:
    - evt.Overflow.Alert.Events[0].GetMeta('http_path') startsWith Lower('/_matrix/')
    - evt.Overflow.Alert.GetScenario() == 'crowdsecurity/http-probing'

# cat /etc/crowdsec/postoverflows/s01-whitelist/whitelist-matrix.yaml (target_fqdn) :

name: tetsumaki/matrix
description: "custom matrix whitelist"
whitelist:
  reason: "whitelist false positive for matrix"
  expression:
    - evt.Overflow.Alert.Events[0].GetMeta('target_fqdn') == 'matrix.domain.tld'
    - evt.Overflow.Alert.GetScenario() == 'crowdsecurity/http-probing'

# cat /etc/crowdsec/postoverflows/s01-whitelist/whitelist-matrix.yaml (datasource_path) :

name: tetsumaki/matrix
description: "custom matrix whitelist"
whitelist:
  reason: "whitelist false positive for matrix"
  expression:
    - evt.Overflow.Alert.Events[0].GetMeta('datasource_path') == '/var/lib/caddy/log/matrix.domain.tld.log'
    - evt.Overflow.Alert.GetScenario() == 'crowdsecurity/http-probing'

Or more aggressive:

# cat /etc/crowdsec/postoverflows/s01-whitelist/whitelist-matrix.yaml (generic) :

name: tetsumaki/matrix
description: "custom matrix whitelist"
whitelist:
  reason: "whitelist false positive for matrix"
  expression:
    - evt.Overflow.Alert.Events[0].GetMeta('http_path') startsWith Lower('/_matrix/')
    - evt.Overflow.Alert.GetScenario() in ['crowdsecurity/http-probing', 'crowdsecurity/http-crawl-non_statics']

# cat /etc/crowdsec/postoverflows/s01-whitelist/whitelist-matrix.yaml (target_fqdn) :

name: tetsumaki/matrix
description: "custom matrix whitelist"
whitelist:
  reason: "whitelist false positive for matrix"
  expression:
    - evt.Overflow.Alert.Events[0].GetMeta('target_fqdn') == 'matrix.domain.tld'
    - evt.Overflow.Alert.GetScenario() in ['crowdsecurity/http-probing', 'crowdsecurity/http-crawl-non_statics']

# cat /etc/crowdsec/postoverflows/s01-whitelist/whitelist-matrix.yaml (datasource_path) :

name: tetsumaki/matrix
description: "custom matrix whitelist"
whitelist:
  reason: "whitelist false positive for matrix"
  expression:
    - evt.Overflow.Alert.Events[0].GetMeta('datasource_path') == '/var/lib/caddy/log/matrix.domain.tld.log'
    - evt.Overflow.Alert.GetScenario() in ['crowdsecurity/http-probing', 'crowdsecurity/http-crawl-non_statics']
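
One thing worth checking with the whitelists above: my understanding is that CrowdSec whitelists an event as soon as any one entry under `expression` is true, so listing the path check and the scenario check as separate entries may whitelist more than intended (for example, every `crowdsecurity/http-probing` alert regardless of path). Treat that as an assumption to verify against the whitelist documentation for your version. If it holds, a minimal sketch of a stricter variant joins both conditions in a single entry with `and`:

```yaml
# Hypothetical stricter variant of the "generic" whitelist above (not from the thread).
# Assumption: entries under `expression` are alternatives (any true entry whitelists),
# so both conditions are combined with `and` inside one entry.
name: tetsumaki/matrix
description: "custom matrix whitelist"
whitelist:
  reason: "whitelist false positive for matrix"
  expression:
    - >-
      evt.Overflow.Alert.GetScenario() in ['crowdsecurity/http-probing', 'crowdsecurity/http-crawl-non_statics']
      and evt.Overflow.Alert.Events[0].GetMeta('http_path') startsWith '/_matrix/'
```

The same pattern should apply to the `target_fqdn` and `datasource_path` variants by swapping the second condition.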

ethindp commented 1 year ago

@buixor Not anymore; I've migrated servers and currently don't have CrowdStrike set up. I've also completely switched to Docker (with Traefik as the front-end server, which is itself containerized), so I have no idea whether CrowdStrike would even be able to pick up HTTP-based events from Traefik.

I do have my suspicions, however. In particular, I'm about 99 percent sure that this has to do with federation. Matrix is a complex set of specifications, and some of them (e.g. the Server-Server API) require that bots (that is, Matrix servers) contact other servers on port 8448 (the port is configurable, but 8448 is the default, so let's use that) and send federation requests. This allows for fully decentralized communications.

On small servers that don't federate with much, this probably only happens 30-50 times a minute (perhaps a bit more; I've no way to know). But on large instances that federate with a lot of communities, like mine, it can happen thousands of times, if not tens of thousands. I have no way of acquiring "times per minute"-ish metrics, however, so I can't give you truly accurate numbers. But the frequency of federation requests (or, really, any server-to-server requests) would, without context, possibly look like an HTTP DDoS, since all of this happens transparently and in an automated fashion.

ethindp commented 1 year ago

Okay, so I have CrowdSec installed (I used CrowdStrike in my previous comment, sorry about that). I'm not really sure how to get the specific requests that cause CrowdSec to ban Matrix servers, but one possibility is a whitelist that ignores crowdsecurity/http-probing and crowdsecurity/http-crawl-non_statics for certain domains only. The playbook I'm using has recently migrated to Traefik, though Nginx is still used for some things; the eventual goal is to migrate entirely to Traefik. Is this possible at the moment? I'm currently using cscli simulation enable --global to ensure that everything is just simulated and no actual action is taken (the bouncer is also not running), so I can figure out how to prevent CrowdSec from freaking out about Matrix and risking banning everyone who tries connecting to my server.
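
A minimal sketch of the "certain domains only" idea, modeled on the `target_fqdn` whitelist posted earlier in this thread. The file name, the `example/matrix-domains` name, and the domain list are placeholders, and it assumes that the parser for your Traefik or Nginx logs fills in the `target_fqdn` meta field the way the Caddy alert above shows; I haven't verified that for Traefik.

```yaml
# Hypothetical file: /etc/crowdsec/postoverflows/s01-whitelist/whitelist-matrix-domains.yaml
# Assumptions: `target_fqdn` is populated for your log source, and both conditions
# must live in one entry joined with `and` (separate entries are believed to act as OR).
name: example/matrix-domains
description: "whitelist Matrix false positives for selected domains"
whitelist:
  reason: "Matrix traffic on known domains"
  expression:
    - >-
      evt.Overflow.Alert.GetScenario() in ['crowdsecurity/http-probing', 'crowdsecurity/http-crawl-non_statics']
      and evt.Overflow.Alert.Events[0].GetMeta('target_fqdn') in ['matrix.domain.tld', 'element.domain.tld']
```

With something like this in place, the two scenarios would keep applying to every other domain on the same host.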

VPaulV commented 1 year ago

@tetsumaki's solution works great, thank you.

helkaluin commented 1 year ago

After an upgrade to 1.5.3, @tetsumaki's solution stopped working somehow.

LaurenceJJones commented 1 year ago

After an upgrade to 1.5.3, @tetsumaki's solution stopped working somehow.

We posted in the Discord that a new release is coming out on 20 Sept with a fix for postoverflow whitelists, as 1.5.3 had a bug.