emqx / MQTTX

A Powerful and All-in-One MQTT 5.0 client toolbox for Desktop, CLI and WebSocket.
https://mqttx.app
Apache License 2.0

[Bug] saved binary data is incorrect #1806

Open calle2010 opened 1 week ago

calle2010 commented 1 week ago

What did I do

I tried to read binary data from a topic and save it with "--file-write".

./mqttx-cli-macos-x64 sub -h 192.168.xxx.xxx --format binary --output-mode clean --file-write ew10.netp -t home/ew10/state >ew10.json

What happened

The output in "ew10.netp" contains extraneous bytes and doesn't match the clean output in "ew10.json", which has the correct data:

$ hexdump -C ew10.netp | head -3 
00000000  1b 77 00 00 00 04 ef bf  bd 01 01 07 0a 1b 78 00  |.w............x.|
00000010  00 00 05 ef bf bd 03 02  00 68 0a 1b 79 00 00 00  |.........h..y...|
00000020  04 ef bf bd 01 01 07 0a  1b 7a 00 00 00 0f ef bf  |.........z......|
$ head -40 ew10.json
{
  "topic": "home/ew10/state",
  "payload": {
    "type": "Buffer",
    "data": [
      43,
      84,
      0,
      0,
      0,
      4,
      247,
      1,
      1,
      7
    ]
  },
  "packet": {
    "cmd": "publish",
    "retain": false,
    "qos": 0,
    "dup": false,
    "length": 28,
    "topic": "home/ew10/state",
    "payload": {
      "type": "Buffer",
      "data": [
        43,
        84,
        0,
        0,
        0,
        4,
        247,
        1,
        1,
        7
      ]
    }
  }

Expected

The binary file should contain exactly the payload bytes shown in the clean output.

Environment

More detail

It seems to me that the binary data is being decoded as UTF-8 somewhere along the way, which fails for arbitrary bytes. Every byte sequence that is not valid UTF-8 appears to be replaced with ef bf bd, the UTF-8 encoding of the Unicode replacement character U+FFFD. This sequence can be found multiple times in the output.

Byte sequences that are valid ASCII, like "00 00 00 04" or "01 01 07", can be found unchanged in the output.
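
The suspected conversion can be reproduced outside of MQTTX. The following is a minimal Python sketch, not the tool's actual code path: it takes the payload bytes listed in ew10.json, decodes them as UTF-8 with replacement of invalid sequences, and encodes the result again. The byte 247 (0xf7) is not valid UTF-8 and comes back as ef bf bd, while the ASCII-range bytes pass through untouched.

```python
# Payload bytes taken from the "data" array in ew10.json.
payload = bytes([43, 84, 0, 0, 0, 4, 247, 1, 1, 7])

# Decode as UTF-8, substituting U+FFFD for anything invalid, then re-encode.
round_tripped = payload.decode("utf-8", errors="replace").encode("utf-8")

print(payload.hex(" "))        # 2b 54 00 00 00 04 f7 01 01 07
print(round_tripped.hex(" "))  # 2b 54 00 00 00 04 ef bf bd 01 01 07
```

The tail "00 00 00 04 ef bf bd 01 01 07" has the same shape as the records visible in the hexdump of ew10.netp, which is consistent with a text conversion happening before the file is written.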

ysfscream commented 1 week ago

Hi! You're encountering an issue where --format binary isn't being properly handled when writing to a .netp file. Here's a temporary workaround that pipes the MQTTX CLI output through other tools:

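./mqttx-cli-macos-x64 sub -h 192.168.xxx.xxx -t home/ew10/state --format binary --output-mode clean | jq -r '.packet.payload.data[]' | awk '{ printf "%02x", $1 }' | xxd -r -p > ew10.netp

This command:

  1. Subscribes in clean output mode with --format binary, so the payload is emitted as a JSON array of byte values
  2. Extracts the payload bytes with jq, one decimal value per line
  3. Converts each value to two hex digits with awk
  4. Converts the hex back to binary via xxd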
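
If jq or xxd is not available, the same extraction can be sketched in Python. This is only a sketch, assuming the clean output has the shape shown in ew10.json: a stream of pretty-printed JSON objects, each with the byte values under payload.data. The function name extract_payloads and the script name in the usage comment are hypothetical and not part of MQTTX.

```python
import json
import sys
from typing import List


def extract_payloads(text: str) -> List[bytes]:
    """Return the payload bytes of every JSON object found in text."""
    decoder = json.JSONDecoder()
    payloads = []
    pos = 0
    while pos < len(text):
        # Skip whitespace between concatenated JSON objects.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode parses one JSON value and reports where it ended.
        message, pos = decoder.raw_decode(text, pos)
        payloads.append(bytes(message["payload"]["data"]))
    return payloads


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name):
    #   python extract_payloads.py ew10.json ew10.netp
    with open(sys.argv[1], encoding="utf-8") as src, open(sys.argv[2], "wb") as dst:
        for payload in extract_payloads(src.read()):
            dst.write(payload)
```

The output file is opened in binary mode, so the bytes are written exactly as listed in the JSON and no text encoding is applied on the way out.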

The root cause is likely that the system doesn't recognize .netp as a binary format extension.

calle2010 commented 1 week ago

Interesting. I would never have expected the file extension to make a difference. Specifying format binary should eliminate all guessing done by the tool.

I don’t even know if netp is a well-known extension. I just picked it because it was used in the device I wanted to monitor.

ysfscream commented 1 week ago

Yes, perhaps we will reverse the --format binary logic, so that the saved data is treated as text only when utf-8 is specified. Everything else should be treated as binary.
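
The reversed logic could look roughly like the Python sketch below. It is purely illustrative: the function bytes_to_write and its fmt parameter are hypothetical names, and nothing here reflects how MQTTX is or will be implemented. The point is only that the text conversion becomes opt-in and raw bytes are the default.

```python
from typing import Optional


def bytes_to_write(payload: bytes, fmt: Optional[str]) -> bytes:
    """Decide what ends up in the output file for a received payload."""
    if fmt == "utf-8":
        # Text was requested explicitly: normalise to valid UTF-8,
        # replacing invalid sequences with U+FFFD.
        return payload.decode("utf-8", errors="replace").encode("utf-8")
    # Any other format, or none at all: write the bytes untouched.
    return payload
```

With this default, a payload such as the one in ew10.json would reach the file byte for byte regardless of the file extension.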