mcollina / mows

Using MQTT.js in the browser over WebSocket -- Built with browserify!

UTF8 strings getting scrambled #4

Closed ldstein closed 10 years ago

ldstein commented 10 years ago

I'm currently testing Mows in the browser and having trouble with UTF8 strings which contain Chinese characters.

This issue is consistent in FF and Chrome on both Windows 8 and Ubuntu 13.04.

Steps to recreate:

  1. Start Mosca with mosca --http-port 1884 --http-bundle -v | bunyan
  2. Load the following HTML / JS script in Google Chrome:

index.html

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8"> 
    <script type="text/javascript" src="http://localhost:1884/mqtt.js" charset="utf-8"></script>
    <script type="text/javascript" src="client.js" charset="utf-8"></script>
</head>
<body>
</body>
</html>

client.js

var port = 1884;

var topic = '/foo';
var message = '好';

// Require MQTT if env is NodeJS
if(typeof module !== 'undefined' && module.exports)
{
    mqtt = require('mows');
}

var client = mqtt.createClient(port, 'localhost');

client.on('connect', function(){
    console.log('Connected');
    client.subscribe('/foo');
    client.publish(topic, message);
    console.log('Message Sent:', topic, "'" + message + "'");
});

client.on('message', function(topic, message){
    console.log('Message Received:', topic, "'" + message + "'");
});

This results in the following browser console:

Connected
Message Sent: /foo '好'
Message Received: /foo '}' 

Note that if client.js is run in Node (node client.js), the character is sent and received successfully.

Furthermore, if a Node client publishes "好", a subscribed browser client will successfully receive "好".

This looks like a UTF-8 to byte array conversion error that occurs only when browser-side mows creates the payload. If I change the encoding to binary with mqtt.createClient(port, 'localhost', {encoding: 'binary'}); the following byte arrays are generated:

Message "好" sent by NodeJS : <Buffer e5 a5 bd>
Message "好" sent by Chrome : <Buffer 7d 00 00>
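
The two buffers are consistent with one possible failure mode: the payload is sized for the 3-byte UTF-8 encoding, but each character is written as a single byte taken from the low 8 bits of its UTF-16 code unit. "好" is U+597D, so its low byte is 0x7d, which is the ASCII character '}' seen in the console. The sketch below reproduces both byte sequences under that assumption. The helper names (utf8Bytes, truncatedBytes, toHex) are hypothetical and are not taken from mows or bops; the actual cause inside bops is not confirmed by this example.

```javascript
// Correct UTF-8 encoding, using the TextEncoder built into
// modern browsers and Node.js.
function utf8Bytes(str) {
  return Array.from(new TextEncoder().encode(str));
}

// ASSUMPTION: a hypothetical model of the faulty browser path.
// The array has the correct UTF-8 length, but each character is
// stored as the low byte of its UTF-16 code unit.
function truncatedBytes(str) {
  var out = new Array(utf8Bytes(str).length).fill(0);
  for (var i = 0; i < str.length; i++) {
    out[i] = str.charCodeAt(i) & 0xff;
  }
  return out;
}

// Format bytes the way Node prints a Buffer's contents.
function toHex(bytes) {
  return bytes
    .map(function (b) { return b.toString(16).padStart(2, '0'); })
    .join(' ');
}

console.log(toHex(utf8Bytes('好')));      // e5 a5 bd
console.log(toHex(truncatedBytes('好'))); // 7d 00 00
```

Pure ASCII strings give identical output from both functions, which would explain why the problem only shows up with characters outside that range.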

ldstein commented 10 years ago

Some further digging suggests this issue is related to Bops. See https://github.com/chrisdickinson/bops/issues/11

mcollina commented 10 years ago

Exactly, all the encoding is handled inside bops, so it's really not an issue here. I'm keeping this open to keep track of the problem.

ldstein commented 10 years ago

The UTF8 translation bug has been fixed in Bops v0.1.1.

mcollina commented 10 years ago

Released MQTT.js v0.3.6.