gcla / termshark

A terminal UI for tshark, inspired by Wireshark
MIT License

Warning: Could not decode PDML data: ... "illegal character code" #133

Open clort81 opened 2 years ago

clort81 commented 2 years ago

Prerequisites

Please verify these before submitting an issue.

Package: termshark Version: 2.2.0-2

Yes

Yes

Problem

Running sudo termshark -i [interface] works at first, but then a warning box appears: "Could not decode PDML data: XML syntax error on line 78925: illegal character code U+0006."

Current Behavior

Running sudo termshark -i [interface] works at first, but then a warning box appears: "Could not decode PDML data: XML syntax error on line 78925: illegal character code U+0006."

Expected Behavior

No warning popup box.

Screenshots as applicable

Steps to Reproduce

Run termshark -i eth0

Context

Please provide the complete output of these commands:

$ termshark -v
termshark 2.2.0

Please also provide any relevant information about your environment (OS, VM, pi,...)

Devuan ceres, aarch64

clort81 commented 2 years ago

The error appears to come from this loop in pcap/loader.go:

                Loop:
                        for {
                                tok, err := d.Token()
                                if err != nil {
                                        if !issuedKill && unexpectedPdmlError(err) {
                                                err = fmt.Errorf("Could not read PDML data: %v", err)
                                                issuedKill = true
                                                pdmlCancelFn()
                                                HandleError(PdmlCode, app, err, cb)
                                        }
                                        if err == io.EOF {
                                                readAllRequiredPdml = true
                                        }
                                        break

Not knowing Go, I haven't been able to disable it yet.

gcla commented 2 years ago

Hi @clort81 - yes you're right, that's the source of the message that termshark emits. The problem seems to come from invalid XML generated by tshark in some circumstances. I saw it most recently working with telnet. If you download this pcap, you can see the invalid XML by running this command:

https://drive.google.com/file/d/1B3NJv8oOARlY7aztkVNA8oB4SYGFfas3/view?usp=sharing

$ tshark -r zork.pcap -T pdml | xmllint --noout -
-:1485: parser error : invalid character in attribute value
ield name="telnet.data" showname="Data: �\030\001" size="3" pos="40" show="�
                                                                               ^
-:1485: parser error : attributes construct error
ield name="telnet.data" showname="Data: �\030\001" size="3" pos="40" show="�
                                                                               ^
...

These characters fail isInCharacterRange() from the Go stdlib's xml.go. I'm not certain about this diagnosis yet though...

I could suppress the message but the problem really is that the XML parsing breaks at this point. While I look more closely, here's a crummy workaround:

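#!/usr/bin/env bash

# Wrapper around tshark: when PDML output is requested, delete every byte
# except tab, LF, CR and printable ASCII so the XML stays parseable.
if [[ " $* " =~ " pdml " ]]; then
    exec tshark "$@" | tr -cd '\11\12\15\40-\176'
else
    exec tshark "$@"
fi

Save it as /usr/local/bin/tshark-hack, make it executable, and point termshark at it in termshark.toml:

[main]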
  tshark = "/usr/local/bin/tshark-hack"

Let me know if that doesn't work :-)
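A gentler variant of the same idea, sketched in Go: instead of deleting everything outside printable ASCII, drop only invalid UTF-8 bytes and the runes that the XML 1.0 Char production forbids. This is an illustrative stand-in for the tr stage in the wrapper, not part of termshark or tshark. The names validXMLRune, sanitize and sanitizeString are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
	"unicode/utf8"
)

// validXMLRune reports whether r is allowed by the XML 1.0 Char production.
func validXMLRune(r rune) bool {
	return r == 0x09 || r == 0x0A || r == 0x0D ||
		(r >= 0x20 && r <= 0xD7FF) ||
		(r >= 0xE000 && r <= 0xFFFD) ||
		(r >= 0x10000 && r <= 0x10FFFF)
}

// sanitize copies src to dst, skipping invalid UTF-8 bytes and any rune
// that an XML 1.0 parser would reject.
func sanitize(dst io.Writer, src io.Reader) error {
	in := bufio.NewReader(src)
	out := bufio.NewWriter(dst)
	for {
		r, size, err := in.ReadRune()
		if err == io.EOF {
			return out.Flush()
		}
		if err != nil {
			return err
		}
		if r == utf8.RuneError && size == 1 {
			continue // a byte that is not valid UTF-8
		}
		if !validXMLRune(r) {
			continue // e.g. U+0006
		}
		if _, err := out.WriteRune(r); err != nil {
			return err
		}
	}
}

// sanitizeString is a convenience wrapper for trying the filter on a string.
func sanitizeString(s string) (string, error) {
	var b strings.Builder
	err := sanitize(&b, strings.NewReader(s))
	return b.String(), err
}

func main() {
	// Used as a filter: tshark ... -T pdml | this-program
	if err := sanitize(os.Stdout, os.Stdin); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Compared with the tr filter, this keeps legitimate non-ASCII text in field values. It still silently discards the offending bytes, so the displayed data will not match the capture exactly.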

gcla commented 2 years ago

Here's a Wireshark merge-request to fix this at the source:

https://gitlab.com/wireshark/wireshark/-/merge_requests/7398