Closed BonsaiDen closed 13 years ago
Yeah, that's a known issue. The cause is a bug (or a specification quirk) in the Flash Socket class, if I remember correctly. It should be possible to implement a workaround, though. So you can either change your code to not use \x00, or implement the workaround and send me a patch :)
Hm, that's strange. At least under Adobe AIR, flash.net.Socket works as expected and returns the \x00 bytes.
It depends on which method you use. readUTFBytes() throws away everything after \x00, if I remember correctly.
Got it to work using readByte; code from my AIR thingy follows :)
socket.addEventListener(air.ProgressEvent.SOCKET_DATA, function(e) {
    // Grab the bytes
    while (socket.bytesAvailable > 0) {
        var b = socket.readByte();
        bytes.push(b < 0 ? 256 + b : b);
    }
    // find the header...
    if (!header) {
        var line = [];
        for (var i = 0; i < bytes.length; i++) {
            line.push(bytes.shift());
            if (line[0] == 13 && line[1] == 10
                && line[2] == 13 && line[3] == 10) {
                header = true;
                break;
            }
            if (line.length > 3) {
                line.shift();
            }
        }
        if (header) {
            air.trace('header');
            that.onopen();
        }
    }
    // read the packets
    if (header) {
        while ((end = bytes.indexOf(255)) != -1) {
            var part = bytes.slice(0, end + 1);
            part.shift();
            part.pop();
            var str = '';
            for (var i = 0, l = part.length; i < l; i++) {
                str += String.fromCharCode(part[i]);
            }
            // decode the utf8 bytes
            str = decodeURIComponent(escape(str));
            that.onmessage({'data': str});
            bytes.splice(0, end + 1);
        }
    }
});
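The decodeURIComponent(escape(str)) line above is a well-known trick for reinterpreting a one-char-per-byte binary string as UTF-8. A minimal standalone sketch (the helper name utf8BytesToString is mine, not from the patch):

```javascript
// Reinterpret an array of UTF-8 bytes as a JavaScript string.
// escape() percent-encodes each char code as a raw byte (e.g. %C3),
// and decodeURIComponent() then decodes that byte sequence as UTF-8.
function utf8BytesToString(bytes) {
    var str = '';
    for (var i = 0; i < bytes.length; i++) {
        str += String.fromCharCode(bytes[i]);
    }
    return decodeURIComponent(escape(str));
}
```

For example, utf8BytesToString([0xC3, 0x84]) yields 'Ä'. Note that escape() is deprecated, which is part of why the trick feels fragile.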
OK, a fix is on the way; currently it works in Chrome and Firefox. Opera fails because its implementation of decodeURIComponent is broken and truncates the string after %00. And IE... well, I'm already swearing like crazy about this whole mess.
OK, Opera is working. It's not decodeURIComponent that's broken, it's the MessageEvent, which doesn't like \x00 bytes. Gonna give it a try in IE now...
decodeURIComponent(escape(str)) is a nice trick, but it looks a bit fragile. It might be better to use readUTFBytes() as I did, but call it for each chunk split by \x00. That looks more efficient, too.
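The chunk-splitting idea can be sketched in JavaScript, with Node's Buffer standing in as the ByteArray and the helper name decodeWithZeros being mine:

```javascript
// Decode a UTF-8 byte buffer that may contain \x00 bytes: decode each
// zero-delimited chunk as a whole (one decoder call per chunk, not per
// byte) and re-insert the zero characters between the chunks.
function decodeWithZeros(buf) {
    var parts = [];
    var start = 0;
    for (var i = 0; i < buf.length; i++) {
        if (buf[i] === 0x00) {
            parts.push(buf.slice(start, i).toString('utf8'));
            start = i + 1;
        }
    }
    // trailing chunk after the last \x00 (or the whole buffer if none)
    parts.push(buf.slice(start).toString('utf8'));
    return parts.join('\x00');
}
```

If the message contains no \x00 at all, this degenerates to a single decode call, just like the original code.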
OK, here's a version without the decode trick; I can't imagine how this could be made any faster or more efficient.
var data:String = "";
for (var i:int = 1, zeroByte:String = String.fromCharCode(0x00); i <= pos; i++) {
    if (buffer[i] == 0x00) {
        data += buffer.readUTFBytes(pos - buffer.position) + zeroByte;
        buffer.position = i + 1;
    } else if (buffer[i] == 0xff) {
        data += buffer.readUTFBytes(buffer.bytesAvailable - 1);
    }
}
Thanks for the patch. Pulled with a bug fix.
I would suggest making this an optional code path. I made pretty much the same change in order to pass binary data, and it was MUCH slower, so I decided not to use the patch. In fact, I discovered that it was actually faster to base64 encode on the server and then base64 decode in JavaScript. That was faster than doing byte-by-byte reads in ActionScript to catch \x00.
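For reference, the client side of the base64 path described above would look roughly like this (a sketch; it assumes the server base64-encodes the raw bytes before framing, and the helper name base64ToBytes is mine):

```javascript
// Decode a base64-encoded binary payload into an array of byte values.
// atob() yields a one-char-per-byte string, so charCodeAt() recovers
// each original byte; \x00 bytes survive untouched.
function base64ToBytes(b64) {
    var bin = atob(b64);
    var bytes = [];
    for (var i = 0; i < bin.length; i++) {
        bytes.push(bin.charCodeAt(i));
    }
    return bytes;
}
```

The trade-off, as noted below, is the roughly 33% traffic overhead that base64 adds.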
Is it really that slow? When the data doesn't contain \x00, the additional cost should be just:

    for (var i:int = 1; i < pos; i++) { if (buffer[i] == 0x00) { ... } }

which doesn't look like a slow operation... And it should be at most twice as slow as the original code in the worst case, because we already do the same scan for \xff.
Hm, any benchmarks? I somehow can't believe that the loop plus string concatenation in AS would be slower than doing a whole base64 decode in JavaScript. Using base64 would also, at least in my case, defeat one of the purposes of using binary data, which is to reduce traffic on the server side.
kanaka, I guess you refer to this: http://github.com/gimite/web-socket-js/issues/closed#issue/18 and the code you showed there calls buffer.readUTFBytes() for each character. Note that the patch by BonsaiDen is different in that it calls readUTFBytes() once for each chunk split by \x00, i.e. it is called just once, as in the original code, if the message doesn't contain any \x00.
Yes, that's true that yours is a different (and better) solution.
I will still note, however, that this is THE inner loop, so any performance change is going to have a significant impact on overall performance. You've not only doubled the number of times that all the data is scanned, you've replaced a single call to the readUTFBytes built-in (likely optimized C++ code) with a bunch of AS operations (in addition to one or more calls to the readUTFBytes built-in).
I also suspect that string concatenations in AS trigger garbage collection, as they do in JavaScript, which will introduce more CPU overhead and jitter.
Note that noVNC can easily receive several MB of data within a second or two, so any slow-downs in main code paths are very obvious. I might not get to testing for a few days, but I'll let you know what I find. If it does impact performance significantly, I still think having an option to turn off '\x00' detection (when you know it won't happen) would be a good idea.
A couple of points of interest:
Concerning the points you've made: yes, the new WebSocket draft will fix quite a lot of stuff, and I'm also looking forward to it, but it would also be great if at least the guys over at Opera could get their ...... up and implement WebSockets at all. IE... well, I guess we all know that with the current release schedule we'll have to wait at least 2 years until MS brings out IE10. Not even speaking of the adoption of those browsers.
On your last point: I wonder whether that try/catch approach will work; gonna give it a try, though. We would have way fewer problems without the forced UTF-8 encoding.
For example, instead of scanning for \x00, you could do the readUTFBytes call and then check the length of the returned string to see whether it increased by the amount requested; if not, there was a \x00, so back the position up by that amount and try again until we've read the amount we want.
I'm not sure whether there is an easy way to get the number of bytes in the string (note that length is the number of characters). If we need to convert back to a byte array to do that, it may be slower than the current implementation (not sure without trying it, though).
The only way to get the number of bytes that were actually read from the array would be to convert the UTF-8 string back to bytes; everything else is out of the question. The whole ByteArray thingy is, in my opinion, broken beyond repair. Even readMultiByte(length, 'ascii') stops after the first 0x00. It's not like Java/ActionScript wouldn't allow for 0x00 in strings; must have been a C guy who implemented this stuff.

So we have only one option here, I think: do some performance tests and make the 0x00-byte handling optional.
Sigh. You're right. I had momentarily forgotten that we are dealing with variable length multi-byte UTF-8.
Well, you could at least short-circuit the scan by 50% by doing a readUTFBytes read first, then scanning from the length returned instead of from the beginning. I.e. if readUTFBytes(100) returns 60 characters, then you know that at least the first 60 bytes of the buffer are not \x00, so you can skip them.
Then you would still have the problem that you don't know whether you just got lots of multi-byte-encoded UTF-8 characters or just something with a 0x00 in it.

Example: let's assume that BA are the two bytes of an encoded UTF-8 char Ä, and we supply 5 as the length parameter to readUTFBytes:

    BABABA 0xff  ->  ÄÄÄ  ->  length 3
    cat 0x00 test  ->  cat  ->  length 3

See the problem? There's no direct way of figuring out whether we just hit a 0x00 or some multi-byte characters.
I think you misunderstood what I was saying (perhaps I just wasn't clear). If readUTFBytes returns N characters (not bytes), then you know that the string of bytes that encoded those characters is at least N bytes long before you reach a \x00. I.e. if you call readUTFBytes(100) and you get back 60 characters, then the \x00 (if there is one at all) has to be somewhere in bytes 60-99 and can't be in bytes 0-59, so there is no reason to start scanning at byte 0.
So if you have random UTF8 data (that uses the complete encoding space) then you will on average skip searching 40% of the buffer for the '\x00' (on average 100 bytes would become 40 characters). Chances are though that the data is not randomly encoded (i.e. most data is probably in the low end), so you would probably do even better than 50% less scanning for '\x00'.
I still see a problem due to the multi-byte characters: how would you prevent the whole thing from reading some bytes twice? If you skip the first 60 bytes, you may still be within the first 60 characters; if you now continue to read from byte 60 onwards, you'll end up reading stuff that's already in your string. Then you'll hit the 0x00 again, do the next skip of bytes, and again read some stuff that's already in your buffer.

What I'm trying to make clear is that without checking how many bytes you've actually read, you can't do the skip, because there's no way to prevent the whole thing from reading things twice.
Actually the average efficiency is probably closer to 25% for uniformly distributed code points.
That would only be true if a \x00 in UTF-8 could encode multiple ways. But a \x00 is never part of a multi-byte sequence; it's always 1 to 1. The smallest UTF-8 encoding of 60 characters is 60 bytes (i.e. all code points are less than 128). The largest encoding would be 240 bytes (4 bytes per character). If readUTFBytes returns 60 characters, then you know that there can't be a \x00 in the first 60 bytes, because it would have been treated as a string terminator by readUTFBytes and it would have returned LESS than 60 characters.
Also, you might be confusing the scan with the read. You only ever call readUTFBytes from the beginning, or right after a '\x00'. If you scan to the end and don't find a '\x00' then you know the readUTFBytes call decoded the whole buffer and you only read once. If you find a '\x00' then you know readUTFBytes stopped there, so you still need to continue reading after the '\x00'.
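kanaka's scan-after-read idea can be sketched like this (a sketch only: Node's Buffer stands in for Flash's ByteArray, and readUTFBytesSim is a hypothetical stand-in for readUTFBytes, which stops decoding at the first \x00):

```javascript
// Simulate readUTFBytes(): decode up to the first \x00, as Flash does.
function readUTFBytesSim(buf) {
    var zero = buf.indexOf(0x00);
    return buf.slice(0, zero === -1 ? buf.length : zero).toString('utf8');
}

// Every UTF-8 character occupies at least one byte, so if the decode
// returned N characters, bytes 0..N-1 cannot contain a \x00 and the
// scan for the terminator can start at offset N instead of 0.
function findZeroAfterRead(buf) {
    var chars = readUTFBytesSim(buf).length;
    for (var i = chars; i < buf.length; i++) {
        if (buf[i] === 0x00) {
            return i;
        }
    }
    return -1; // no \x00: the decode consumed the whole buffer
}
```

The scan is only used to locate the \x00; the actual read still happens exactly once per chunk, so no bytes are ever decoded twice.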
BTW, another way around this problem is to use "Modified UTF-8", which encodes 0 as '\xc0' '\x80': http://en.wikipedia.org/wiki/UTF-8#Modified_UTF-8 But it's not technically legal UTF-8 (even though it's used in a bunch of places) and would require the server to comply.
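A minimal sketch of that encoding (the helper name is mine, and it handles BMP characters only): the single difference from standard UTF-8 is that U+0000 becomes the overlong two-byte pair 0xC0 0x80, so the byte stream never contains a literal zero byte.

```javascript
// Encode a string as "Modified UTF-8": standard UTF-8, except that
// U+0000 is written as the overlong pair 0xC0 0x80.
function encodeModifiedUTF8(str) {
    var out = [];
    for (var i = 0; i < str.length; i++) {
        if (str.charCodeAt(i) === 0) {
            out.push(0xC0, 0x80);
        } else {
            // BMP-only sketch: encode one character at a time as UTF-8
            var bytes = Buffer.from(str[i], 'utf8');
            for (var j = 0; j < bytes.length; j++) {
                out.push(bytes[j]);
            }
        }
    }
    return out;
}
```

A standards-compliant UTF-8 decoder must reject the 0xC0 0x80 pair as overlong, which is why the server would have to decode it specially.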
OK, I guess I got it now. Having a Skype call with 3 other people while discussing technical things isn't thaaaat good. Anyway, I've implemented something like that in the AIR version of my socket client, which also uses way fewer loops than the current Flash implementation:
socket.addEventListener(air.ProgressEvent.SOCKET_DATA, function(e) {
    socket.readBytes(buffer, buffer.length, 0);
    var len = buffer.length;

    // Check the packets
    var start = -1;
    var end = -1;
    var data = '';
    for (var i = 0; i < len; i++) {
        // uhh...
        if (header < 4) {
            if ((header === 0 || header === 2) && buffer[i] === 0x0d) {
                ++header;
            } else if ((header === 1 || header === 3) && buffer[i] === 0x0a) {
                ++header;
            } else {
                header = 0;
            }
            if (header === 4) {
                buffer.position = i + 1;
                that.onopen();
                header = 5;
            }
        }
        if (header === 5) {
            if (buffer[i] === 0xff) {
                that.onmessage({'data': data + buffer.readUTFBytes(i - buffer.position)});
                buffer.position = i + 1;
                data = '';
                start = -1;
                end = i;
            } else if (buffer[i] === 0x00) {
                if (start === -1) {
                    start = i;
                    // do the initial read
                    buffer.position = i + 1;
                    data = buffer.readUTFBytes((i + 1) - buffer.position);
                    i += data.length;
                } else {
                    // this should add the 0x00 all the time
                    data += buffer.readUTFBytes(i - buffer.position) + '\x00';
                }
                buffer.position = i + 1;
            }
        }
    }
    if (end !== -1) {
        truncateBuffer(end + 1);
    }
});
Seems to work fine after playing my game for a couple of minutes.
OK, so here's a new version which has way less overhead and lag.

edit: Did some cleanups and added another check to prevent unnecessary reads:
http://github.com/BonsaiDen/web-socket-js/commit/4a81da3d97d98062575ade96964b3c8dbf4d56f7
Did you check whether it actually improves performance? At a glance, I'm not sure what would cause a performance improvement.
Well, I did some of the infamous new Date().getTime() tests; they yielded 0-1 ms for both versions after I disabled the calls to main.log (which obviously creates overhead due to unnecessary string concatenations). It will certainly do fewer reads on data that doesn't contain 0x00, and it's also way more responsive now (at least in Opera), so I guess there should be some improvement. We'll have to wait and see what kanaka can report about his VNC client, since I'm only sending relatively small amounts of data.
So overall:
- Lag reduced: yes
- Reads/loops reduced: yes
- Performance improved: maybe
OK, there you go, I did some performance tests. And, well, what should I say? We know as much as before...

An echo script measured the round trip on localhost. Sent to the client were 512 messages (one message every 50 ms), each 2048 bytes plus a preceding ID. Sent back to the server were only the IDs. Below are the best results of 3 successive runs per version.
Current version on master:
without \x00: ~33.2ms
with \x00 every 256th byte: ~35.4ms
My version:
without \x00: 36.3ms
with \x00 every 256th byte: 36.2ms
Native WebSocket:
without \x00: 3.5-3.7ms
with \x00 every 256th byte: 2.8-3.4ms
So, yeah, we can clearly see that the native implementation is way faster, go figure... But besides that, even though my version from above should do fewer reads on data that includes \x00, it's apparently slower, if only by a tiny amount.

I suspect that the real performance bottleneck is in the message passing between AS and JS; I'm gonna have to do some investigation on that.
It seems there's something wrong with your handling of the packets. I'm using binary-encoded data for one of my projects (http://github.com/BonsaiDen/NodeGame-Shooter), and after the first \x00 in the data the message gets terminated, even though it hasn't reached the closing \xff yet. It works fine in the native implementations.

Example of a message:

    \x00TestFoblablaFoo\x00Bla Test\xff

The expected data would be TestFoblablaFoo\x00Bla Test, but actually it's TestFoblablaFoo.