[BR]: regex doesn't match on 0.11.2 but does on 0.11.1

brantleyp1 commented 1 year ago

Environment:

Freshly built/updated ubuntu 20.04 and ubuntu 22.04

# apt list -a fail2ban
Listing... Done
fail2ban/jammy,now 0.11.2-6 all [installed]

# apt list -a fail2ban
Listing... Done
fail2ban/focal,now 0.11.1-1 all [installed]

Both systems run the stock Ubuntu repos, and both are up to date with the latest packages for their respective releases.

The issue:

I have a custom jail running for freeradius. While trying to get the filtering correct, I noticed a difference in how fail2ban-regex handles the same config files on the two systems.

I'm focusing on ignoreregex at the moment.

Steps to reproduce

On both systems I have a test.log file containing:

Fri Oct 14 14:42:40 2022 : Auth: (46994) Login OK: [user] (from client all_ipv4 port 0 cli 192.232.180.194) SUCCESS, source-ip: 172.109.201.51, nas-ip: 172.109.201.51
Fri Oct 14 14:43:07 2022 : Auth: (47003) Login OK: [user] (from client all_ipv4 port 0 cli 206.166.196.171) SUCCESS, source-ip: 23.130.0.9, nas-ip: 23.130.0.9
Fri Oct 14 14:45:56 2022 : Auth: (47071) Login OK: [user] (from client all_ipv4 port 0 cli 68.63.84.183) SUCCESS, source-ip: 192.232.180.194, nas-ip: 192.232.180.194

A test config containing:

# Fail2ban filter for freeradius
[INCLUDES]
before = common.conf
[Definition]
failregex =     Auth: \(.*\) Login incorrect \(.*\): \[.*\] \(from client all_ipv4 port .* cli <HOST>\).*$
                Auth: \(.*\) Invalid user.*\(.*cli <HOST>\).*$
ignoreregex =   .*Auth:.*OK.*$

When I run fail2ban-regex test.log testing_filter.conf I get different results.

From 22.04:

root@radius:/etc/fail2ban/filter.d# fail2ban-regex test.log testing_filter.conf

Running tests
=============

Use   failregex filter file : testing_filter, basedir: /etc/fail2ban
Use      datepattern : {^LN-BEG} : Default Detectors
Use         log file : test.log
Use         encoding : UTF-8

Results
=======

Failregex: 0 total

Ignoreregex: 0 total

Date template hits:
|- [# of hits] date format
|  [3] {^LN-BEG}(?:DAY )?MON Day %k:Minute:Second(?:\.Microseconds)?(?: ExYear)?
`-

Lines: 3 lines, 0 ignored, 0 matched, 3 missed
[processed in 0.00 sec]

|- Missed line(s):
|  Fri Oct 14 14:42:40 2022 : Auth: (46994) Login OK: [user] (from client all_ipv4 port 0 cli 192.232.180.194) SUCCESS, source-ip: 172.109.201.51, nas-ip: 172.109.201.51
|  Fri Oct 14 14:43:07 2022 : Auth: (47003) Login OK: [user] (from client all_ipv4 port 0 cli 206.166.196.171) SUCCESS, source-ip: 23.130.0.9, nas-ip: 23.130.0.9
|  Fri Oct 14 14:45:56 2022 : Auth: (47071) Login OK: [user] (from client all_ipv4 port 0 cli 68.63.84.183) SUCCESS, source-ip: 192.232.180.194, nas-ip: 192.232.180.194
`-

From 20.04:

root@radius:/etc/fail2ban/filter.d# fail2ban-regex test.log testing_filter.conf

Running tests
=============

Use   failregex filter file : testing_filter, basedir: /etc/fail2ban
Use      datepattern : Default Detectors
Use         log file : test.log
Use         encoding : UTF-8

Results
=======

Failregex: 0 total

Ignoreregex: 3 total
|-  #) [# of hits] regular expression
|   1) [3] .*Auth:.*OK.*$
`-

Date template hits:
|- [# of hits] date format
|  [3] {^LN-BEG}(?:DAY )?MON Day %k:Minute:Second(?:\.Microseconds)?(?: ExYear)?
`-

Lines: 3 lines, 3 ignored, 0 matched, 0 missed
[processed in 0.00 sec]

|- Ignored line(s):
|  Fri Oct 14 14:42:40 2022 : Auth: (46994) Login OK: [user] (from client all_ipv4 port 0 cli 192.232.180.194) SUCCESS, source-ip: 172.109.201.51, nas-ip: 172.109.201.51
|  Fri Oct 14 14:43:07 2022 : Auth: (47003) Login OK: [user] (from client all_ipv4 port 0 cli 206.166.196.171) SUCCESS, source-ip: 23.130.0.9, nas-ip: 23.130.0.9
|  Fri Oct 14 14:45:56 2022 : Auth: (47071) Login OK: [user] (from client all_ipv4 port 0 cli 68.63.84.183) SUCCESS, source-ip: 192.232.180.194, nas-ip: 192.232.180.194
`-

Note: I get the same result with or without the .* before Auth in ignoreregex. Also, even if I put in the exact timestamp of one of the 3 lines, I still get 0 matches.

Expected behavior

I would expect the same results.

sebres commented 1 year ago

It is not a bug but a feature! In short: ignoreregex does not need to be applied, because none of those messages match your failregex. fail2ban-regex was rewritten to be more similar to fail2ban-server, so it now uses the same filter process as the server itself. Fail2ban (the server) has always applied ignoreregex only if the message matches some failregex. In other words, simply look at the output with the missed lines, because failregex has higher precedence than ignoreregex.
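
The order described above can be sketched in a few lines of Python. This is only a simplified model of that description, not fail2ban's actual code: the `classify` helper is hypothetical, and `<HOST>` is replaced by a plain IPv4 pattern as a stand-in for fail2ban's real substitution.

```python
import re

# Assumption: a simplified IPv4 pattern stands in for fail2ban's <HOST> tag.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

FAILREGEX = [
    r"Auth: \(.*\) Login incorrect \(.*\): \[.*\] \(from client all_ipv4 port .* cli <HOST>\).*$",
    r"Auth: \(.*\) Invalid user.*\(.*cli <HOST>\).*$",
]
IGNOREREGEX = [r".*Auth:.*OK.*$"]


def compile_all(patterns):
    """Compile filter patterns after substituting the <HOST> placeholder."""
    return [re.compile(p.replace("<HOST>", HOST)) for p in patterns]


def classify(line, fail_res, ignore_res):
    """Hypothetical helper: return 'missed', 'ignored' or 'matched' for one line."""
    # Step 1: failregex is tried first.
    if not any(r.search(line) for r in fail_res):
        return "missed"  # ignoreregex is never consulted for this line
    # Step 2: only a line that hit some failregex is checked against ignoreregex.
    if any(r.search(line) for r in ignore_res):
        return "ignored"
    return "matched"


# The three lines from test.log in the report.
LINES = [
    "Fri Oct 14 14:42:40 2022 : Auth: (46994) Login OK: [user] (from client all_ipv4 port 0 cli 192.232.180.194) SUCCESS, source-ip: 172.109.201.51, nas-ip: 172.109.201.51",
    "Fri Oct 14 14:43:07 2022 : Auth: (47003) Login OK: [user] (from client all_ipv4 port 0 cli 206.166.196.171) SUCCESS, source-ip: 23.130.0.9, nas-ip: 23.130.0.9",
    "Fri Oct 14 14:45:56 2022 : Auth: (47071) Login OK: [user] (from client all_ipv4 port 0 cli 68.63.84.183) SUCCESS, source-ip: 192.232.180.194, nas-ip: 192.232.180.194",
]

fail_res = compile_all(FAILREGEX)
ignore_res = compile_all(IGNOREREGEX)
results = [classify(line, fail_res, ignore_res) for line in LINES]
print(results)
```

Under this model all three "Login OK" lines are reported as missed rather than ignored, which corresponds to the 0.11.2 output in the report.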

brantleyp1 commented 1 year ago

Just curious then, is ignoreregex going away?

For instance I have an IP running a scan tool that I know generates failed logins, but I don't want it blocked, so instead of regex'ing for or \d+... I have the literal IP. Just hoping that functionality isn't going away.

Is ignoreregex processed before or after failregex?

sebres commented 1 year ago

Just curious then, is ignoreregex going away?

No, it isn't. It is (and always was) applied after a successful match with failregex (which did not match in your case). It is just that fail2ban-regex previously used a slightly different algorithm, which is now more similar to fail2ban-server processing.

Is ignoreregex processed before or after failregex?

After, and only if failregex matches.

BTW, the abovementioned filter doesn't really need ignoreregex if failregex is written more precisely (anchored and without catch-alls), for instance:

- failregex =     Auth: \(.*\) Login incorrect \(.*\): \[.*\] \(from client all_ipv4 port .* cli <HOST>\).*$
+ failregex =     ^Auth: \([^\)]*\) Login incorrect \([^\)]*\): \[[^\]]*\] \(from client [^\)]* cli <HOST>\)

Furthermore, the ignoreregex .*Auth:.*OK.*$ is extremely vulnerable: if the word OK appears anywhere in the foreign user input of an incorrect-login or invalid-user message, it would mistakenly match. An intruder could work this out and use it in further requests, and so continue the attack without getting banned.
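
To illustrate that risk, here is a rough sketch in Python. The log line is hypothetical (a failed login whose user name is literally OK, with the timestamp already removed), and `<HOST>` is again replaced by a simplified IPv4 pattern, which is an assumption rather than fail2ban's real substitution.

```python
import re

# Assumption: simplified stand-in for fail2ban's <HOST> tag.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# Original catch-all failregex and ignoreregex from the report.
loose_fail = re.compile(
    r"Auth: \(.*\) Login incorrect \(.*\): \[.*\] "
    r"\(from client all_ipv4 port .* cli <HOST>\).*$".replace("<HOST>", HOST)
)
ignore = re.compile(r".*Auth:.*OK.*$")

# Anchored failregex suggested above; no ignoreregex is needed with it.
strict_fail = re.compile(
    r"^Auth: \([^\)]*\) Login incorrect \([^\)]*\): \[[^\]]*\] "
    r"\(from client [^\)]* cli <HOST>\)".replace("<HOST>", HOST)
)

# Hypothetical failed login: the attacker chose "OK" as the user name.
attack = ("Auth: (47100) Login incorrect (mschap: bad password): [OK] "
          "(from client all_ipv4 port 0 cli 203.0.113.7)")
# Successful login taken from the report, timestamp removed.
ok_line = ("Auth: (46994) Login OK: [user] "
           "(from client all_ipv4 port 0 cli 192.232.180.194) SUCCESS")

# Old filter: the failure matches failregex, but ignoreregex then discards it.
evaded = bool(loose_fail.search(attack)) and bool(ignore.search(attack))

# Stricter filter: the failure is caught, and successful logins never match.
caught = strict_fail.search(attack)
print(evaded, caught.group("host"), strict_fail.search(ok_line))
```

With the original pair of expressions the crafted line matches failregex and is then thrown away by ignoreregex, whereas the anchored expression alone still reports the host.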

For instance I have an IP running a scan tool that I know generates failed logins, but I don't want it blocked, so instead of regex'ing for or \d+... I have the literal IP.

Don't confuse ignoreip and ignoreregex - they are different and processed differently.

Additionally, note that it is always possible to create a filter without using ignoreregex at all, either by using a more precise prefregex or failregex, or by using negative lookahead and lookbehind in the prefregex or failregex. The ignoreregex is an atavism, retained for backwards-compatibility reasons.
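
As a small illustration of the lookahead approach, the sketch below skips failures for one account name directly in the failregex. The account name "probe", the sample lines, and the simplified `<HOST>` replacement are all assumptions made for the example; excluding an IP address is still a job for ignoreip.

```python
import re

# Assumption: simplified stand-in for fail2ban's <HOST> tag.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# (?!probe\]) is a negative lookahead: the match is rejected when the
# bracketed user name is exactly "probe" (a hypothetical monitoring account).
fail = re.compile(
    r"^Auth: \([^\)]*\) Login incorrect \([^\)]*\): \[(?!probe\])[^\]]*\] "
    r"\(from client [^\)]* cli <HOST>\)".replace("<HOST>", HOST)
)

# Hypothetical log lines, timestamp already removed.
template = ("Auth: (47110) Login incorrect (mschap: bad password): [{user}] "
            "(from client all_ipv4 port 0 cli 203.0.113.7)")

hits = {user: bool(fail.search(template.format(user=user)))
        for user in ("probe", "probe2", "alice")}
print(hits)
```

Only the exact name "probe" is skipped here; "probe2" and "alice" still count as failures, so no separate ignoreregex is involved.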

brantleyp1 commented 1 year ago

In general, I think I just learned more about how f2b works than from my trial and error over the last month, thank you.

As for the ^Auth anchor, I assumed I needed to leave off the anchor since that is after the timestamp.

Again, thank you for your time.

sebres commented 1 year ago

As for the ^Auth anchor, I assumed I needed to leave off the anchor since that is after the timestamp.

Well, not really. Firstly, the timestamp is cut out after the datepattern matches, so the anchor is totally OK (or perhaps use something like ^\s*Auth to allow optional spaces after the timestamp). Secondly, without the anchor the regex remains "vulnerable" to possible foreign input (where that is conceivable, for instance in referrer or agent strings), and at the very least slow, because without an anchor fail2ban would search for the match everywhere in the string (which can be really long in an access log, considering that all the branches of the RE are retried at every position in the string). So again: the anchor should be mandatory.
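
The foreign-input point can be sketched as follows. Both log lines are hypothetical, the simplified `<HOST>` replacement is an assumption, and the example assumes (as stated above) that the message starts with Auth, optionally preceded by spaces, once the timestamp has been cut out.

```python
import re

# Assumption: simplified stand-in for fail2ban's <HOST> tag.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

BODY = (r"Auth: \([^\)]*\) Login incorrect \([^\)]*\): \[[^\]]*\] "
        r"\(from client [^\)]* cli <HOST>\)").replace("<HOST>", HOST)

unanchored = re.compile(BODY)         # may match at any position in the line
anchored = re.compile(r"^\s*" + BODY)  # must match at the start of the message

# Hypothetical message of another type that echoes attacker-supplied text.
# The embedded fragment names an innocent address (198.51.100.9).
injected = ('Info: rejected request, client sent: '
            '"Auth: (1) Login incorrect (x): [u] (from client any cli 198.51.100.9)"')

# Hypothetical genuine failure, with a leading space left after the timestamp.
genuine = (" Auth: (47100) Login incorrect (mschap: bad password): [bob] "
           "(from client all_ipv4 port 0 cli 203.0.113.7)")

spoofed = unanchored.search(injected)  # finds the fragment inside foreign input
blocked = anchored.search(injected)    # no match: the line starts with "Info:"
real = anchored.search(genuine)        # still matches the genuine failure
print(spoofed.group("host"), blocked, real.group("host"))
```

The unanchored expression picks up the address embedded in the foreign input, while the anchored one rejects that line and still matches the genuine failure.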
