[Bug] Message published as plain text instead of base64

edlundin commented 2 years ago

What did I do

I wrote a plain text message and set the payload format to "Base64". The text box then showed my message encoded in base64, and I published it.

What happened

While the GUI shows the published message in base64, the subscribers (not only MQTTX) receive it in plain text.

Expected

The subscribers should receive this message in base64.

Environment

OS: macOS Monterey 12.5
MQTTX version: v1.8.2

More detail

Here's a screenshot of what I see after publishing the same message in base64 and plain text:

screenshot

ysfscream commented 2 years ago

Hi, thanks for using MQTTX. I'm not sure this is a bug. Base64 is just the encoding format in which you enter the payload; the messages are binary during transmission.

However, when the data is received, it is displayed as plain text because a separate decoding step is applied on the receiving side.

edlundin commented 2 years ago

Hi,

My screenshot and description should've been more explicit.

I am trying to highlight that by selecting "base64" (from the drop-down menu at the bottom of the screenshot, followed by "QoS", "Retain", and "Meta"), the message in the text box is encoded. However, MQTTX uses the plain text version instead of the encoded one when publishing.

Furthermore, the selected decoder is not at fault since a subscription from another client (e.g. my phone) receives the same plain text message instead of the base64 one.

The same behaviour happens when using "Hex" instead of "Base64".

I do not know what the intent was when implementing this feature. What I feel, however, from a UX point of view, is that when updating a user-inputted text following the action of a component (selection from a drop-down menu, click of a button, ...), I do expect this updated input to be used instead of the original one.

ysfscream commented 2 years ago

@edlundin Sorry for my late reply. Your question here made me rethink this feature.

My understanding is that the current feature is just a data conversion tool: for example, when you have the text hello, you can convert it to hex, base64, etc. You can also convert the data to plain text when you receive it. In MQTT messaging, whatever content is sent is converted to binary. So if the string hello is displayed in base64, e.g. aGVsbG8=, and the decoder is plain text at the time of reception, I think it is normal to see hello.

So, what do you think is the correct way to use it here? Could you describe what you need? Sorry if I am missing some relevant knowledge; feel free to add to the discussion.

edlundin commented 2 years ago

Hi,

No worries. I'll try to explain how I see it; if it is still too abstract, I'll try to make a diagram or something.

If I remember correctly, per the MQTT specification, the payload is an opaque sequence of bytes; I assume here that text typed into the text box is sent as UTF-8. Since the text encoding only matters for characters outside the ASCII table, the following examples use ASCII-compatible text. Their binary form is shown in hexadecimal for clarity's sake.

Take the message "HelloWorld!", to be published on the topic "testtopic/data". The selected encoder, as well as the decoder, is "Plain text".

With this configuration, the encoder only encodes "HelloWorld!" to UTF-8 (or gets it from the text box in this format). In its binary form, "HelloWorld!" looks like "48656c6c6f576f726c6421". The subscribers will receive "48656c6c6f576f726c6421", and because the decoder is "Plain text", no further action is taken but to interpret it as a UTF-8 string. Hence, "48656c6c6f576f726c6421" becomes "HelloWorld!" again and is shown to the user.
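The plain-text case can be checked in a few lines of Python. This is only a sketch of the byte-level view described above, not MQTTX code; the variable names are made up for illustration.

```python
# Plain text in, plain text out: the payload is just the UTF-8 bytes.
message = "HelloWorld!"

payload = message.encode("utf-8")    # bytes the publisher puts on the wire
payload_hex = payload.hex()          # hexadecimal view of those bytes

received = payload.decode("utf-8")   # "Plain text" decoder: interpret as UTF-8

print(payload_hex)  # 48656c6c6f576f726c6421
print(received)     # HelloWorld!
```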

On the other hand, with the encoder set to "Base64" and the decoder left to "Plain text", there is an extra step when publishing the message: "48656c6c6f576f726c6421" should be encoded in base64 before the publication, becoming "534756736247395862334a735a43453d". Looking at this in UTF-8, "HelloWorld!" becomes "SGVsbG9Xb3JsZCE=". Like in the first example, the subscribers receive the payload, this time the base64-encoded one: "534756736247395862334a735a43453d". After interpreting it as a UTF-8 string, "SGVsbG9Xb3JsZCE=" is shown to the user.
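The extra publish step can be sketched the same way. Again, this is illustrative Python using the standard `base64` module, not the application's actual code path.

```python
import base64

message = "HelloWorld!"

# Extra step on publish: base64-encode the UTF-8 bytes of the input.
payload = base64.b64encode(message.encode("utf-8"))

print(payload.hex())            # 534756736247395862334a735a43453d
# "Plain text" decoder on the subscriber: interpret the bytes as UTF-8.
print(payload.decode("utf-8"))  # SGVsbG9Xb3JsZCE=
```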

If the decoder is "Base64", then before interpreting the payload as a UTF-8 string, it should be treated as base64 and decoded. The result is then interpreted as a UTF-8 string and shown to the user: "534756736247395862334a735a43453d" resolves to "HelloWorld!".
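A minimal sketch of that subscriber-side step, assuming the payload arrives as the bytes shown in hexadecimal in the previous paragraph (illustrative Python, not MQTTX code):

```python
import base64

# Payload as received from the broker, written here in hexadecimal.
received = bytes.fromhex("534756736247395862334a735a43453d")

# "Base64" decoder: undo the base64 step first, then interpret as UTF-8.
decoded = base64.b64decode(received).decode("utf-8")

print(decoded)  # HelloWorld!
```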

I am trying to illustrate that in its current version, the application publishes the input of the user, ignoring the selected encoder. If I input "HelloWorld!" and then select the "Base64" encoder, while the UI is updated, the payload will still contain the binary form of "HelloWorld!". However, it should be the binary form of "SGVsbG9Xb3JsZCE=": the user input, encoded in base64.
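To make the difference concrete, here is a hedged side-by-side of the two behaviours. The function names `payload_current` and `payload_expected` are hypothetical and only model what is described in this thread; they are not taken from the MQTTX code base.

```python
import base64


def payload_current(text: str) -> bytes:
    """Observed behaviour: the selected encoder only changes what the UI shows."""
    return text.encode("utf-8")


def payload_expected(text: str, encoder: str) -> bytes:
    """Requested behaviour: the selected encoder is applied to the published bytes."""
    raw = text.encode("utf-8")
    if encoder == "Base64":
        return base64.b64encode(raw)
    return raw  # "Plain text": passthrough


print(payload_current("HelloWorld!"))             # b'HelloWorld!'
print(payload_expected("HelloWorld!", "Base64"))  # b'SGVsbG9Xb3JsZCE='
```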

TL;DR: I see the encoder (publish) as a rapid translation tool from an input encoding (e.g. plain text) to another (e.g. base64). Furthermore, I see the "Plain text" encoder/decoder as a passthrough: nothing is altered, and the subscriber receives the publication in the same "format". The decoder (subscribe) performs the reverse operation (decoding base64, interpreting hex as UTF-8, ...). From what I understand, the decoder is currently an encoder for the subscription.
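One way to picture this model is a pair of lookup tables, one per direction, with "Plain text" as the identity. The sketch below is an assumption about how such a design could look; `ENCODERS`, `DECODERS`, `publish`, and `receive` are hypothetical names, not MQTTX APIs.

```python
import base64
import binascii
from typing import Callable, Dict

Codec = Callable[[bytes], bytes]

# Publish side: translate the raw input bytes into the selected format.
ENCODERS: Dict[str, Codec] = {
    "Plain text": lambda raw: raw,  # passthrough
    "Base64": base64.b64encode,
    "Hex": binascii.hexlify,
}

# Subscribe side: the reverse operation for each format.
DECODERS: Dict[str, Codec] = {
    "Plain text": lambda payload: payload,  # passthrough
    "Base64": base64.b64decode,
    "Hex": binascii.unhexlify,
}


def publish(text: str, encoder: str) -> bytes:
    """Return the bytes that would be sent for the given input and encoder."""
    return ENCODERS[encoder](text.encode("utf-8"))


def receive(payload: bytes, decoder: str) -> str:
    """Return the text a subscriber would display for the given decoder."""
    return DECODERS[decoder](payload).decode("utf-8")


# Matching encoder and decoder always round-trip to the original text.
for fmt in ENCODERS:
    assert receive(publish("HelloWorld!", fmt), fmt) == "HelloWorld!"

# A mismatched pair shows the encoded form instead.
print(receive(publish("HelloWorld!", "Hex"), "Plain text"))
# 48656c6c6f576f726c6421
```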

ysfscream commented 1 year ago

Sorry for the late reply, I was a bit busy. Thanks for your very detailed explanation. I think I understand what you mean: if you send Base64, the receiver should also display Base64, right?

But I'm wondering whether users should have to select the display format manually. Perhaps I need a switch that synchronizes the encoder type on publishing with the decoder type on receiving? I'm not sure. Another user raised the same issue, so I'm not quite sure what to do now 😅 Currently this feature is more like a translation...

edlundin commented 1 year ago

That's right.

If you wish to keep the current feature while implementing the other, could a switch labelled "preview" when encoding do the trick?

When decoding, you could argue that you might want to change the decoder on a per-message basis, or at least be able to change it not only at reception but also in the message list.
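A per-message decoder could work by keeping the raw bytes of each received message and decoding only when rendering. The sketch below is one possible shape for that idea; `StoredMessage` and `render` are hypothetical names, and nothing here reflects how MQTTX actually stores messages.

```python
import base64
from dataclasses import dataclass


@dataclass
class StoredMessage:
    topic: str
    payload: bytes               # raw bytes exactly as received
    decoder: str = "Plain text"  # per-message choice, changeable later


def render(msg: StoredMessage) -> str:
    """Decode for display only; the stored payload is never modified."""
    data = msg.payload
    if msg.decoder == "Base64":
        data = base64.b64decode(data)
    elif msg.decoder == "Hex":
        data = bytes.fromhex(data.decode("ascii"))
    return data.decode("utf-8", errors="replace")


msg = StoredMessage("testtopic/data", b"SGVsbG9Xb3JsZCE=")
print(render(msg))  # SGVsbG9Xb3JsZCE=

msg.decoder = "Base64"  # switch the decoder from the message list
print(render(msg))  # HelloWorld!
```

Because the payload is stored untouched, switching the decoder back to "Plain text" shows the base64 text again.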

I looked at the code base to see where those changes would occur, if at all, and I have a rough idea of where everything should go. However, I have never used Vue.js, nor have I ever worked on front-end code. Implementing it would therefore require a bit of learning and experimentation on my part.

ysfscream commented 1 year ago

That sounds like a good idea, I'll think about it, and your contribution is welcome. : )

emqx / MQTTX