kellyjonbrazil / jc

CLI tool and python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts.
MIT License
7.92k stars 210 forks source link

Netstat issue with spaces in program name #578

Open kellyjonbrazil opened 4 months ago

kellyjonbrazil commented 4 months ago

Hello @kellyjonbrazil , I have found a bug that got to the implementation by adding the 7 to the list of TCP / UDP states. I would like to help you to resolve it, I just want to ask - is it better trying to "reopen" this MergeRequest and related issue or creating a new one?

Originally posted by @HervisDaubeny in https://github.com/kellyjonbrazil/jc/issues/447#issuecomment-2199942177

HervisDaubeny commented 4 months ago

While running our monitoring script which uses jc to parse netstat output on linux, it started throwing "index out of range" errors. I added some debug printing to jc/parsers/netsat_linux.py to see what is going on.

First one was to see the data comming in

def parse_network(headers, entry):
    LIST_OF_STATES = [
        "ESTABLISHED", "SYN_SENT", "SYN_RECV", "FIN_WAIT1", "FIN_WAIT2",
        "TIME_WAIT", "CLOSED", "CLOSE_WAIT", "LAST_ACK", "LISTEN", "CLOSING",
        "UNKNOWN", "7"
    ]

    # split entry based on presence of value in "State" column
    contains_state = any(state in entry for state in LIST_OF_STATES)
    split_modifier = 1 if contains_state else 2

    ### DEBUG PRINT ###
    print(entry)
    entry = entry.split(maxsplit=len(headers) - split_modifier)

output:

tcp        0      0 0.0.0.0:9898            0.0.0.0:*               LISTEN      1178/pgpool
tcp        0      0 192.168.68.116:9102     0.0.0.0:*               LISTEN      584/bareos-fd
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      600/sshd: /usr/sbin
tcp        0      0 0.0.0.0:5432            0.0.0.0:*               LISTEN      1178/pgpool
tcp        0      0 0.0.0.0:5433            0.0.0.0:*               LISTEN      1676/postgres
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      928/exim4
tcp        0      0 0.0.0.0:10050           0.0.0.0:*               LISTEN      1817931/zabbix_agen
tcp        0      0 0.0.0.0:9000            0.0.0.0:*               LISTEN      1187/pgpool: watchd
tcp6       0      0 :::9898                 :::*                    LISTEN      1178/pgpool
tcp6       0      0 :::22                   :::*                    LISTEN      600/sshd: /usr/sbin
tcp6       0      0 :::5432                 :::*                    LISTEN      1178/pgpool
tcp6       0      0 :::5433                 :::*                    LISTEN      1676/postgres
tcp6       0      0 :::10050                :::*                    LISTEN      1817931/zabbix_agen
udp        0      0 0.0.0.0:68              0.0.0.0:*                           535/dhclient
udp        0      0 0.0.0.0:37300           0.0.0.0:*                           1204/pgpool: heartb
udp        0      0 0.0.0.0:9694            0.0.0.0:*                           1205/pgpool: heartb
udp        0      0 0.0.0.0:9694            0.0.0.0:*                           1203/pgpool: heartb
udp        0      0 0.0.0.0:44649           0.0.0.0:*                           1206/pgpool: heartb
udp        0      0 0.0.0.0:52868           0.0.0.0:*                           494/rsyslogd

The interesting part comes here, It is clear that none of the udp processes has value in the STATE column, howerver after running the script with this debug print added, we can see the error:

def parse_network(headers, entry):
    LIST_OF_STATES = [
        "ESTABLISHED", "SYN_SENT", "SYN_RECV", "FIN_WAIT1", "FIN_WAIT2",
        "TIME_WAIT", "CLOSED", "CLOSE_WAIT", "LAST_ACK", "LISTEN", "CLOSING",
        "UNKNOWN", "7"
    ]

    # split entry based on presence of value in "State" column
    contains_state = any(state in entry for state in LIST_OF_STATES)
    split_modifier = 1 if contains_state else 2

    entry = entry.split(maxsplit=len(headers) - split_modifier)
    ### DEBUG PRINT ###
    print(entry, len(entry))
['tcp', '0', '0', '0.0.0.0:9898', '0.0.0.0:*', 'LISTEN', '1178/pgpool         '] 7
['tcp', '0', '0', '192.168.68.116:9102', '0.0.0.0:*', 'LISTEN', '584/bareos-fd       '] 7
['tcp', '0', '0', '0.0.0.0:22', '0.0.0.0:*', 'LISTEN', '600/sshd: /usr/sbin '] 7
['tcp', '0', '0', '0.0.0.0:5432', '0.0.0.0:*', 'LISTEN', '1178/pgpool         '] 7
['tcp', '0', '0', '0.0.0.0:5433', '0.0.0.0:*', 'LISTEN', '1676/postgres       '] 7
['tcp', '0', '0', '127.0.0.1:25', '0.0.0.0:*', 'LISTEN', '928/exim4           '] 7
['tcp', '0', '0', '0.0.0.0:10050', '0.0.0.0:*', 'LISTEN', '1817931/zabbix_agen '] 7
['tcp', '0', '0', '0.0.0.0:9000', '0.0.0.0:*', 'LISTEN', '1187/pgpool: watchd '] 7
['tcp6', '0', '0', ':::9898', ':::*', 'LISTEN', '1178/pgpool         '] 7
['tcp6', '0', '0', ':::22', ':::*', 'LISTEN', '600/sshd: /usr/sbin '] 7
['tcp6', '0', '0', ':::5432', ':::*', 'LISTEN', '1178/pgpool         '] 7
['tcp6', '0', '0', ':::5433', ':::*', 'LISTEN', '1676/postgres       '] 7
['tcp6', '0', '0', ':::10050', ':::*', 'LISTEN', '1817931/zabbix_agen '] 7
['udp', '0', '0', '0.0.0.0:68', '0.0.0.0:*', '535/dhclient        '] 6
['udp', '0', '0', '0.0.0.0:37300', '0.0.0.0:*', '1204/pgpool:', 'heartb '] 7
['udp', '0', '0', '0.0.0.0:9694', '0.0.0.0:*', '1205/pgpool: heartb '] 6
['udp', '0', '0', '0.0.0.0:9694', '0.0.0.0:*', '1203/pgpool: heartb '] 6
['udp', '0', '0', '0.0.0.0:44649', '0.0.0.0:*', '1206/pgpool: heartb '] 6
['udp', '0', '0', '0.0.0.0:52868', '0.0.0.0:*', '494/rsyslogd        '] 6

The row with the heartb process gets parsed incorrectly, because it is running on port :37300. The number 7, which is one of the recognized states after the update, gets picked up as delimetter and breaks the formatting of the row with 1204/pgpool: ending up in the STATE column and heartb in PID/Program name

The "index out of range error" I'm getting is because I take the output of jc and split the column PID/Program name on '/' and access both parts of it. In this corner case however, it breaks the script, because of the parsing error I described higher. (since there is no / in the column)

I was localy able to fix the problem when I removed the number 7 from the LIST_OF_STATES. I remember tho, that without the 7 present in it, some tests of yours were failing.

What I would like to do, is to remove the number 7 from LIST_OF_STATES because according to the documentation it is not a valid state that can appear as output of the netstat command, and to try and debug why the tests are failing. I will update you on the progress. Is this approach okay with you?

kellyjonbrazil commented 4 months ago

Oh, dear - that's a nasty little bug. Thanks for looking into this - looking forward to working with you on this!

kellyjonbrazil commented 2 months ago

@HervisDaubeny any update on this? If not, I can take a look. Thanks!

HervisDaubeny commented 2 months ago

Hello, unfortunately I'm currently flooded with work from every side. If you don't mind waiting I would love to look into it, but I can't promise you I will be able to do it soon. My estimate is that I could get to it in november.

Best regards, Sebastian

Message ID: @.***>

Luigi31415 commented 1 month ago

Description

The current code for checking if an entry contains a specific state has a bug where partial matches can mistakenly trigger the condition. The current implementation:

contains_state = any(state in entry for state in LIST_OF_STATES)

does not enforce word boundaries, which means that it can match unintended portions of text, like matching "7" within "0.0.0.0:37300".

Suggested Fix

One potential quick fix to enforce word boundaries is to modify the code as follows:

contains_state = any(re.search(rf"\b{re.escape(state)}\b", entry) for state in LIST_OF_STATES)
kellyjonbrazil commented 1 month ago

What I would like to do, is to remove the number 7 from LIST_OF_STATES because according to the documentation it is not a valid state that can appear as output of the netstat command, and to try and debug why the tests are failing. I will update you on the progress. Is this approach okay with you?

It looks like the tests fail because we do see 7 as state output in some samples. For example this one on Ubuntu 18.0.4:

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 127.0.0.1:42351         0.0.0.0:*               LISTEN      1112/containerd     
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      885/systemd-resolve 
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1127/sshd           
tcp6       0      0 :::22                   :::*                    LISTEN      1127/sshd           
udp        0      0 127.0.0.53:53           0.0.0.0:*                           885/systemd-resolve 
udp        0      0 192.168.71.131:68       0.0.0.0:*                           867/systemd-network 
raw6       0      0 :::58                   :::*                    7           867/systemd-network 
Active UNIX domain sockets (only servers)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name     Path
unix  2      [ ACC ]     SEQPACKET  LISTENING     20812    1/init               /run/udev/control
unix  2      [ ACC ]     STREAM     LISTENING     33765    1723/systemd         /run/user/1000/systemd/private
unix  2      [ ACC ]     STREAM     LISTENING     33808    1723/systemd         /run/user/1000/gnupg/S.gpg-agent.ssh
unix  2      [ ACC ]     STREAM     LISTENING     33809    1723/systemd         /run/user/1000/gnupg/S.dirmngr
unix  2      [ ACC ]     STREAM     LISTENING     33810    1723/systemd         /run/user/1000/gnupg/S.gpg-agent.browser
unix  2      [ ACC ]     STREAM     LISTENING     33811    1723/systemd         /run/user/1000/gnupg/S.gpg-agent
unix  2      [ ACC ]     STREAM     LISTENING     33812    1723/systemd         /run/user/1000/gnupg/S.gpg-agent.extra
unix  2      [ ACC ]     STREAM     LISTENING     20655    1/init               /run/systemd/private
unix  2      [ ACC ]     STREAM     LISTENING     20662    1/init               /run/lvm/lvmetad.socket
unix  2      [ ACC ]     STREAM     LISTENING     20664    1/init               /run/systemd/journal/stdout
unix  2      [ ACC ]     STREAM     LISTENING     20891    1/init               /run/lvm/lvmpolld.socket
unix  2      [ ACC ]     STREAM     LISTENING     27473    1/init               /run/acpid.socket
unix  2      [ ACC ]     STREAM     LISTENING     27443    1/init               /run/snapd.socket
unix  2      [ ACC ]     STREAM     LISTENING     27445    1/init               /run/snapd-snap.socket
unix  2      [ ACC ]     STREAM     LISTENING     27475    1/init               /run/uuidd/request
unix  2      [ ACC ]     STREAM     LISTENING     27481    1/init               /var/run/docker.sock
unix  2      [ ACC ]     STREAM     LISTENING     27489    1/init               /var/run/dbus/system_bus_socket
unix  2      [ ACC ]     STREAM     LISTENING     27468    1/init               /var/lib/lxd/unix.socket
unix  2      [ ACC ]     STREAM     LISTENING     30726    1112/containerd      /run/containerd/containerd.sock
unix  2      [ ACC ]     STREAM     LISTENING     27436    1/init               @ISCSIADM_ABSTRACT_NAMESPACE
unix  2      [ ACC ]     STREAM     LISTENING     25548    607/VGAuthService    /var/run/vmware/guestServicePipe

Looks like we'll have to tighten up the splitting logic per @Luigi31415 's comment above.

Luigi31415 commented 1 month ago

@kellyjonbrazil I think my suggested fix does solve the problem. Let me know if you agree so that I can put in a PR.

kellyjonbrazil commented 1 month ago

@Luigi31415 sounds good - go ahead and use dev as a base. Thanks!

kellyjonbrazil commented 3 days ago

Merged in https://github.com/kellyjonbrazil/jc/pull/608. This fix will go into the next release.