SkyAPM / aiops-engine-for-skywalking

This is an incubating repository of the Apache SkyWalking AIOps Engine
https://github.com/apache/skywalking/discussions/8883
Apache License 2.0

[Algorithm] Drain3 raw log parsing potential enhancement #9

Open Superskyyy opened 2 years ago

Superskyyy commented 2 years ago

Background: Drain log parsing works best when it ingests only the log content, meaning we trim everything else with a simple regex or rule. Accurately slicing the content from a raw line such as

Dec 10 07:28:08 LabSZ sshd[24247]: Received disconnect from 112.95.230.3: 11: Bye Bye [preauth]

down to

Received disconnect from 112.95.230.3: 11: Bye Bye [preauth]

requires prior knowledge of the delimiter, which I am 99% sure users don't care to give. So we need to adapt Drain to be more robust.

I found a potentially(?) major enhancement to the algorithm for RAW log parsing.
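For reference, the "slice with prior knowledge" step could look roughly like the sketch below. The header pattern is an assumption that only fits syslog-style lines shaped like the sample above; the names `HEADER` and `slice_content` are made up for illustration and are not part of Drain3.

```python
import re

# Assumed header layout: "<Mon> <day> <HH:MM:SS> <host> <process>[<pid>]: "
# This is exactly the prior knowledge a user would have to supply.
HEADER = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<proc>[^\s\[]+)\[(?P<pid>\d+)\]:\s+"
)


def slice_content(raw_line: str) -> str:
    """Return only the message content; keep the raw line if the header does not match."""
    match = HEADER.match(raw_line)
    return raw_line[match.end():] if match else raw_line


raw = ("Dec 10 07:28:08 LabSZ sshd[24247]: "
       "Received disconnect from 112.95.230.3: 11: Bye Bye [preauth]")
print(slice_content(raw))
```

Any log source with a different header needs a different pattern, which is why parsing the raw line directly is attractive.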

The current test, shown below, yields much better clustering than the original unreadable results (over-convergence), but it also requires a small adjustment to the global similarity threshold. The idea is that each cluster should have its own standard for accepting new templates, rather than a single global constraint. (This is mentioned in the updated version of the research paper; it is not my invention.)
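To make the per-cluster idea concrete, here is a minimal sketch. It is not the patch itself and not Drain3's code: the `Cluster` class, its `sim_th` field, and `add_message` are hypothetical, the prefix tree is skipped entirely, and the similarity is assumed to be the plain "fraction of identical tokens" measure with wildcards not counted as matches.

```python
from dataclasses import dataclass
from typing import List, Optional

WILDCARD = "<*>"


@dataclass
class Cluster:
    template: List[str]
    sim_th: float      # hypothetical per-cluster acceptance threshold
    size: int = 1


def similarity(template: List[str], tokens: List[str]) -> float:
    """Fraction of positions with identical tokens; wildcards do not count as matches."""
    if len(template) != len(tokens):
        return 0.0
    same = sum(1 for t, tok in zip(template, tokens) if t == tok and t != WILDCARD)
    return same / len(template)


def add_message(clusters: List[Cluster], line: str, default_th: float = 0.4) -> Cluster:
    tokens = line.split()
    best: Optional[Cluster] = None
    best_sim = -1.0
    for cluster in clusters:
        sim = similarity(cluster.template, tokens)
        # Each cluster is judged against its own threshold, not a global one.
        if sim >= cluster.sim_th and sim > best_sim:
            best, best_sim = cluster, sim
    if best is None:
        best = Cluster(template=tokens, sim_th=default_th)
        clusters.append(best)
        return best
    # Generalise the template: differing positions become wildcards.
    best.template = [t if t == tok else WILDCARD for t, tok in zip(best.template, tokens)]
    best.size += 1
    return best


clusters: List[Cluster] = []
add_message(clusters, "Connection closed by 10.0.0.1 [preauth]")
add_message(clusters, "Connection closed by 10.0.0.2 [preauth]")
add_message(clusters, "Invalid user admin from 10.0.0.3")
clusters[1].sim_th = 0.9  # make only this cluster stricter
add_message(clusters, "Invalid user root from 10.0.0.4")
for c in clusters:
    print(c.size, " ".join(c.template))
```

With the default threshold the last line would have joined the "Invalid user" cluster; raising that one cluster's `sim_th` keeps it separate without touching the others. How each threshold should be chosen is left open here.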

I will attempt to submit a patch to the upstream IBM/Drain3 repo and see if it's accepted.

BUT! To yield the most accurate result, we still need to implement dynamic threshold calculation and a cluster merger for the similarity function.

ID=7     : size=177730    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=8     : size=141046    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*> <*> <*>
ID=6     : size=122118    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*> <*>
ID=2     : size=120488    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*>
ID=3     : size=35308     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*>
ID=5     : size=20241     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user <*> from <IP> port <NUM> ssh2
ID=1     : size=18909     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] reverse mapping checking getaddrinfo for <*> [<IP>] failed - POSSIBLE BREAK-IN ATTEMPT!
ID=4     : size=15645     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=9     : size=1331      : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*> <*> <*> <*>
ID=11    : size=932       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*> [preauth]
ID=15    : size=497       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Address <IP> maps to <*> but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
ID=10    : size=493       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*> <*> <*> <*> [preauth]
ID=13    : size=154       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*>
ID=18    : size=108       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*>
ID=20    : size=92        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Normal Shutdown, Thank you for playing [preauth]
ID=14    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed password for invalid user <*> <*> from <IP> port <NUM> ssh2
ID=12    : size=14        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*>
ID=16    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'GET <*> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=17    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException timeout in waiting for rekeying process. [preauth]
ID=19    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'CONNECT xui.ptlogin2.qq.com <NUM> HTTP/<NUM>.<NUM>' from <IP> port <NUM>

Threshold 0.4 (Default, not best)

ID=10    : size=140950    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> password for <*> from <IP> port <NUM> ssh2
ID=9     : size=140701    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=7     : size=68958     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Connection closed by <IP> [preauth]
ID=8     : size=46608     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> <*> <*> [preauth]
ID=14    : size=37963     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM service(sshd) ignoring max retries; <NUM> > <NUM>
ID=12    : size=37298     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Disconnecting Too many authentication failures for <*> [preauth]
ID=13    : size=37029     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=11    : size=36967     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] message repeated <NUM> times <Averylonglist[]>
ID=6     : size=20241     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user <*> from <IP> port <NUM> ssh2
ID=4     : size=19852     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) check pass; user unknown
ID=1     : size=18909     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] reverse mapping checking getaddrinfo for <*> [<IP>] failed - POSSIBLE BREAK-IN ATTEMPT!
ID=2     : size=14551     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user <*> from <IP>
ID=3     : size=14551     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user <*> [preauth]
ID=5     : size=14356     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=18    : size=1289      : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=24    : size=952       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal Read from socket failed Connection reset by peer [preauth]
ID=19    : size=930       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> No more user authentication methods available. [preauth]
ID=15    : size=838       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Did not receive identification string from <IP>
ID=17    : size=592       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Closed due to user request. [preauth]
ID=32    : size=497       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Address <IP> maps to <*> but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
ID=20    : size=182       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd session) session opened for user <*> by (uid=<NUM>)
ID=22    : size=182       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd session) session closed for user <*>
ID=16    : size=177       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException Auth <*> [preauth]
ID=33    : size=108       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> [preauth]
ID=63    : size=92        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Normal Shutdown, Thank you for playing [preauth]
ID=47    : size=81        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Client disconnecting normally [preauth]
ID=52    : size=60        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnect [preauth]
ID=21    : size=34        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnected by user
ID=29    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user <*> <*> [preauth]
ID=30    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed password for invalid user <*> <*> from <IP> port <NUM> ssh2
ID=38    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user from <IP>
ID=39    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user [preauth]
ID=40    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user from <IP> port <NUM> ssh2
ID=31    : size=11        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user <*> admin from <IP>
ID=35    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification <*> <*> from <IP> port <NUM>
ID=37    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'GET <*> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=41    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification <*> from <IP> port <NUM>
ID=34    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal no hostkey alg [preauth]
ID=49    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error connect to <IP> port <NUM> failed.
ID=57    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnected by user [preauth]
ID=28    : size=5         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user myapn cen from <IP>
ID=46    : size=4         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user ftp <*> from <IP>
ID=56    : size=4         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> Authentication cancelled by user. [preauth]
ID=23    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal Write failed Connection reset by peer [preauth]
ID=48    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException timeout in waiting for rekeying process. [preauth]
ID=54    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Server listening on <IP> port <NUM>.
ID=55    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Server listening on port <NUM>.
ID=61    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Disconnect requested by Windows SSH Client.
ID=25    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> User request [preauth]
ID=36    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user back newshops from <IP>
ID=45    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user bash spm from <IP>
ID=51    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> org.vngx.jsch.userauth.AuthCancelException User authentication canceled by user [preauth]
ID=59    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user lcap oracle from <IP>
ID=60    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user zxdbm epg from <IP>
ID=26    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad packet length <NUM>. [preauth]
ID=27    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Disconnecting Packet corrupt [preauth]
ID=42    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user ram k from <IP>
ID=43    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> java.net.SocketTimeoutException Read timed out [preauth]
ID=44    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM>
ID=50    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Corrupted MAC on input. [preauth]
ID=53    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user sugon test from <IP>
ID=58    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException reject HostKey <IP> [preauth]
ID=62    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'CONNECT xui.ptlogin2.qq.com <NUM> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=64    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] syslogin perform logout logout() returned an error

Threshold 0.3: compared to the baseline result below, this looks almost perfect

--- Done processing file in 45.46 sec. Total of 655147 lines, rate 14411.4 lines/sec, 53 clusters
ID=10    : size=140950    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> password for <*> from <IP> port <NUM> ssh2
ID=16    : size=140668    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=7     : size=68958     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Connection closed by <IP> [preauth]
ID=8     : size=46608     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> <*> <*> [preauth]
ID=13    : size=37963     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM service(sshd) ignoring max retries; <NUM> > <NUM>
ID=12    : size=37298     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Disconnecting Too many authentication failures for <*> [preauth]
ID=11    : size=36967     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] message repeated <NUM> times <Averylonglist[]>
ID=47    : size=36803     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=6     : size=20241     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user <*> from <IP> port <NUM> ssh2
ID=4     : size=19852     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) check pass; user unknown
ID=1     : size=18909     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] reverse mapping checking getaddrinfo for <*> [<IP>] failed - POSSIBLE BREAK-IN ATTEMPT!
ID=2     : size=14551     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user <*> from <IP>
ID=3     : size=14551     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user <*> [preauth]
ID=5     : size=14356     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=18    : size=1289      : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=24    : size=952       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal Read from socket failed Connection reset by peer [preauth]
ID=19    : size=932       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*> [preauth]
ID=14    : size=838       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Did not receive identification string from <IP>
ID=17    : size=592       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Closed due to user request. [preauth]
ID=31    : size=497       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Address <IP> maps to <*> but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
ID=9     : size=259       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> user=root
ID=20    : size=182       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd session) session opened for user <*> by (uid=<NUM>)
ID=22    : size=182       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd session) session closed for user <*>
ID=15    : size=177       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException Auth <*> [preauth]
ID=32    : size=108       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> [preauth]
ID=52    : size=92        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Normal Shutdown, Thank you for playing [preauth]
ID=42    : size=87        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> <*> <*> <*> [preauth]
ID=46    : size=60        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnect [preauth]
ID=21    : size=34        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnected by user
ID=28    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user <*> <*> from <IP>
ID=29    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user <*> <*> [preauth]
ID=30    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed password for invalid user <*> <*> from <IP> port <NUM> ssh2
ID=36    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user from <IP>
ID=37    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user [preauth]
ID=38    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user from <IP> port <NUM> ssh2
ID=34    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification <*> <*> from <IP> port <NUM>
ID=35    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'GET <*> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=39    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification <*> from <IP> port <NUM>
ID=33    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal no hostkey alg [preauth]
ID=40    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> <*> <*> <*> <*> [preauth]
ID=44    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error connect to <IP> port <NUM> failed.
ID=23    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal Write failed Connection reset by peer [preauth]
ID=43    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException timeout in waiting for rekeying process. [preauth]
ID=48    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Server listening on <IP> port <NUM>.
ID=49    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Server listening on port <NUM>.
ID=50    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Disconnect requested by Windows SSH Client.
ID=25    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> User request [preauth]
ID=26    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad packet length <NUM>. [preauth]
ID=27    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Disconnecting Packet corrupt [preauth]
ID=41    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM>
ID=45    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Corrupted MAC on input. [preauth]
ID=51    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'CONNECT xui.ptlogin2.qq.com <NUM> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=53    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] syslogin perform logout logout() returned an error

Original version without my patch, but sliced with prior knowledge, threshold 0.4 (default)


--- Done processing file in 25.76 sec. Total of 655147 lines, rate 25432.1 lines/sec, 51 clusters
ID=10    : size=140768    : Failed password for <*> from <IP> port <NUM> ssh2
ID=9     : size=140701    : pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=7     : size=68958     : Connection closed by <IP> [preauth]
ID=8     : size=46642     : Received disconnect from <IP> <NUM> <*> <*> <*>
ID=14    : size=37963     : PAM service(sshd) ignoring max retries; <NUM> > <NUM>
ID=12    : size=37298     : Disconnecting Too many authentication failures for <*> [preauth]
ID=13    : size=37029     : PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=11    : size=36967     : message repeated <NUM> times <Averylonglist[]>
ID=6     : size=20241     : Failed <*> for invalid user <*> from <IP> port <NUM> ssh2
ID=4     : size=19852     : pam unix(sshd auth) check pass; user unknown
ID=1     : size=18909     : reverse mapping checking getaddrinfo for <*> [<IP>] failed - POSSIBLE BREAK-IN ATTEMPT!
ID=2     : size=14551     : Invalid user <*> from <IP>
ID=3     : size=14551     : input userauth request invalid user <*> [preauth]
ID=5     : size=14356     : pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=18    : size=1289      : PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=24    : size=952       : fatal Read from socket failed Connection reset by peer [preauth]
ID=19    : size=932       : error Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*> [preauth]
ID=15    : size=838       : Did not receive identification string from <IP>
ID=17    : size=595       : Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*>
ID=31    : size=497       : Address <IP> maps to <*> but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
ID=20    : size=182       : Accepted password for <*> from <IP> port <NUM> ssh2
ID=21    : size=182       : pam unix(sshd session) session opened for user <*> by (uid=<NUM>)
ID=22    : size=182       : pam unix(sshd session) session closed for user <*>
ID=16    : size=177       : error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException Auth <*> [preauth]
ID=32    : size=108       : Received disconnect from <IP> <NUM> [preauth]
ID=50    : size=92        : Received disconnect from <IP> <NUM> Normal Shutdown, Thank you for playing [preauth]
ID=42    : size=87        : Received disconnect from <IP> <NUM> <*> <*> <*> [preauth]
ID=46    : size=60        : Received disconnect from <IP> <NUM> disconnect [preauth]
ID=28    : size=30        : Invalid user <*> <*> from <IP>
ID=29    : size=30        : input userauth request invalid user <*> <*> [preauth]
ID=30    : size=30        : Failed password for invalid user <*> <*> from <IP> port <NUM> ssh2
ID=36    : size=13        : Invalid user from <IP>
ID=37    : size=13        : input userauth request invalid user [preauth]
ID=38    : size=13        : Failed <*> for invalid user from <IP> port <NUM> ssh2
ID=34    : size=7         : Bad protocol version identification <*> <*> from <IP> port <NUM>
ID=35    : size=7         : Bad protocol version identification 'GET <*> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=39    : size=7         : Bad protocol version identification <*> from <IP> port <NUM>
ID=33    : size=6         : fatal no hostkey alg [preauth]
ID=40    : size=6         : error Received disconnect from <IP> <NUM> <*> <*> <*> <*> [preauth]
ID=44    : size=6         : error connect to <IP> port <NUM> failed.
ID=23    : size=3         : fatal Write failed Connection reset by peer [preauth]
ID=43    : size=3         : error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException timeout in waiting for rekeying process. [preauth]
ID=47    : size=3         : Server listening on <IP> port <NUM>.
ID=48    : size=3         : Server listening on port <NUM>.
ID=25    : size=2         : error Received disconnect from <IP> <NUM> User request [preauth]
ID=26    : size=1         : Bad packet length <NUM>. [preauth]
ID=27    : size=1         : Disconnecting Packet corrupt [preauth]
ID=41    : size=1         : Received disconnect from <IP> <NUM>
ID=45    : size=1         : Corrupted MAC on input. [preauth]
ID=49    : size=1         : Bad protocol version identification 'CONNECT xui.ptlogin2.qq.com <NUM> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=51    : size=1         : syslogin perform logout logout() returned an error
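Comparing these listings by eye is tedious, so a small helper along the following lines may help. It is only a sketch: `parse_listing` and `RAW_PREFIX` are names invented here, and the prefix string is simply the masked header that every raw-log template above starts with. The sample lines are copied from the threshold 0.3 run and the sliced baseline.

```python
import re

LINE = re.compile(r"^ID=(\d+)\s*:\s*size=(\d+)\s*:\s*(.*)$")
# Masked header shared by every template in the raw-log runs above.
RAW_PREFIX = "<Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] "


def parse_listing(text: str, prefix: str = "") -> dict:
    """Map template -> size, optionally dropping a shared header prefix."""
    result = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and the "Done processing" summary
        template = match.group(3)
        if prefix and template.startswith(prefix):
            template = template[len(prefix):]
        result[template] = int(match.group(2))
    return result


raw_run = """
ID=7     : size=68958     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Connection closed by <IP> [preauth]
ID=10    : size=140950    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> password for <*> from <IP> port <NUM> ssh2
"""
baseline_run = """
ID=7     : size=68958     : Connection closed by <IP> [preauth]
ID=10    : size=140768    : Failed password for <*> from <IP> port <NUM> ssh2
ID=20    : size=182       : Accepted password for <*> from <IP> port <NUM> ssh2
"""

raw = parse_listing(raw_run, prefix=RAW_PREFIX)
baseline = parse_listing(baseline_run)
print("shared:", sorted(raw.keys() & baseline.keys()))
print("only in baseline:", sorted(baseline.keys() - raw.keys()))
print("only in raw run:", sorted(raw.keys() - baseline.keys()))
```

On this excerpt the diff surfaces the one visible difference: the raw run folds "Failed password" and "Accepted password" into a single `<*> password` template (140768 + 182 = 140950).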
Superskyyy commented 2 years ago

Some additional information on the log merger: we need to merge clusters with very few log hits into similar clusters, based on a similarity threshold calculated using some equation. I don't know which one yet; I'll figure that out using some heuristics.
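A rough sketch of what such a merger could look like follows. Everything in it is a placeholder: the `min_size` and `merge_th` values, the function names, and the similarity measure (positions count as agreeing when the tokens are equal or either side is a wildcard) are assumptions, not the equation we will end up with. The sample counts are taken from the threshold 0.4 listing above.

```python
WILDCARD = "<*>"


def template_similarity(a, b):
    """Share of positions where two equal-length templates agree or either is a wildcard."""
    if len(a) != len(b):
        return 0.0
    agree = sum(1 for x, y in zip(a, b) if x == y or WILDCARD in (x, y))
    return agree / len(a)


def merge_small_clusters(clusters, min_size=10, merge_th=0.7):
    """clusters maps a template (tuple of tokens) to its hit count; returns a new dict."""
    large = {t: n for t, n in clusters.items() if n >= min_size}
    kept_small = {}
    for tmpl, n in clusters.items():
        if n >= min_size:
            continue
        # Most similar large cluster, if there is any large cluster at all.
        target = max(large, key=lambda t: template_similarity(t, tmpl), default=None)
        if target is not None and template_similarity(target, tmpl) >= merge_th:
            merged = tuple(x if x == y else WILDCARD for x, y in zip(target, tmpl))
            count = large.pop(target) + n
            large[merged] = large.get(merged, 0) + count
        else:
            kept_small[tmpl] = n  # nothing similar enough, keep it as is
    result = dict(large)
    for tmpl, n in kept_small.items():
        result[tmpl] = result.get(tmpl, 0) + n
    return result


clusters = {
    tuple("Invalid user <*> from <IP>".split()): 14551,
    tuple("Invalid user <*> admin from <IP>".split()): 11,
    tuple("Invalid user myapn cen from <IP>".split()): 5,
    tuple("Invalid user ftp <*> from <IP>".split()): 4,
}
for tmpl, n in merge_small_clusters(clusters).items():
    print(n, " ".join(tmpl))
```

Here the two tiny clusters fold into the six-token "admin" cluster, which generalises to `Invalid user <*> <*> from <IP>`, while the five-token cluster is untouched because templates of different lengths are never compared.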

Superskyyy commented 2 years ago

The algorithm enhancements have turned out to be effective. In light of issue #14, which requires further structural modification of the Drain3 code base, we have heavily modified the MIT-licensed Drain3 implementation and intend to host it in our repo (algorithm files will respect the original MIT license header).