crossbario / autobahn-testsuite

Autobahn WebSocket protocol testsuite
https://crossbar.io/autobahn/
Apache License 2.0
997 stars 84 forks source link

Commit f0c83a3 breaks the 12.4 tests. #77

Closed jcoglan closed 6 years ago

jcoglan commented 7 years ago

https://github.com/crossbario/autobahn-testsuite/commit/f0c83a3d6ad157417b7d56091796d8c8798be9c0 makes some changes to the file testdata/data1.html, which is used in the 12.4 block of tests for compression. I don't yet understand why, but these changes break the tests 12.4.{5,6,8,9,10,11,13,14,15,16,17,18}.

I initially discovered this while maintaining my WebSocket libraries (https://github.com/faye), but I have reproduced the failures with the ws npm package, and with the file examples/twisted/websocket/echo_compressed/server.py in the autobahn-python repo.

The tests work fine with the 0.7.5 release. Taking the 0.7.5 package and removing the s from the end of WebSockets as is done in the above-linked commit is enough to make the test start failing.

oberstet commented 7 years ago

Taking the 0.7.5 package and removing the s from the end of WebSockets as is done in the above-linked commit is enough to make the test start failing.

This is very strange obviously. We have to look into that. It won't have time to get to it soon though ..

One thing you could try is use: https://github.com/crossbario/crossbar-docker/tree/master/autobahn-testsuite

This is a Dockerized Autobahn testsuite (clients only for now). By using Docker, we gain more control about the actual bits and deps. We are using this for our own testing of Autobahn websocket impl.

vinniefalco commented 7 years ago

I am having the same problem. Beast WebSockets (https://github.com/boostorg/beast) passes 0.7.5 but fails 0.7.6.

One thing that I have questions about, 12.4.5 says "Send 1000 compressed messages each of payload size 4096". But how is it possible to break up the file data1.html in perfect chunks of 4096 bytes? What if that splits one of the utf8 code points across messages? That would cause a failure.

And by coincidence, my code fails because of a bad UTF8 code point after decompression. And it happens right at the very end of the message buffer.

It looks like Autobahn|Testsuite is splitting a code point across messages, and causing tests to fail.

aaugustin commented 7 years ago

I'm having the exact same issue with https://github.com/aaugustin/websockets.

My implementation aborts with a 1007 error code, which means "Invalid UTF-8".

This looks very much like #71.

aaugustin commented 7 years ago

In https://github.com/crossbario/autobahn-testsuite/issues/71#issuecomment-294192231 exclude_cases was suggested as a workaround.

vinniefalco commented 7 years ago

exclude_cases is unacceptable because I have to display these reports to the public, and they will appear as "skipped". See: http://vinniefalco.github.io/autobahn/index.html

aaugustin commented 7 years ago

Fun fact about public reports: I believe any implementation that currently displays the 12.5.* test cases as passing is non-compliant — it accepts broken UTF-8 that it should reject :-)

vinniefalco commented 7 years ago

Has anyone put together a patch?

vinniefalco commented 6 years ago

This is still broken.

ghost commented 6 years ago

Chapter 12.4 breaks µWS, Beast, websockets/ws

ghost commented 6 years ago

For me, Autobahn seems to send broken utf-8 which is then properly refused and connection gets terminated

oberstet commented 6 years ago

I believe this is the same as issue #71, eg see the comment https://github.com/crossbario/autobahn-testsuite/issues/71#issuecomment-294191346

The current code is splitting UTF8 test strings/files at non-codepoint boundaries, and the removal of the "s"es in the test file triggers the issue here. If that is all true, this is a really wicked one: the issue was always lurking under the hood, but was actually sleeping until the test data file was changed for an apparantly "typo".

The solution is I think what I already dug out in above comment: steal the respective code from current Autobahn (which isn't in the frozen version the test suite uses) for doing splitting that respects code point boundaries, and use that here for 12.*.

vinniefalco commented 6 years ago

The bug is that Autobahn takes the test corpus (the html file I believe) and breaks it up into pieces to use in individual messages for testing. However, it ends up splitting utf-8 code points. Thus, it performs a test with source data that contains a partial code point. This is consistent with @alexhultman's observation of receiving invalid utf-8.

vinniefalco commented 6 years ago

10 months and still no fix, eh?

aaugustin commented 6 years ago

Tobias — and a few other qualified people — would probably be happy to write a fix if they were getting compensated for that work, which no one offered as far as I know...

vinniefalco commented 6 years ago

if they were getting compensated

I'm willing to entertain the idea, but I'd like to see a quote with a price for the work.

ghost commented 6 years ago

@vinniefalco You sold many XRP in January eh? :grin:

![](https://media1.tenor.com/images/8b86a3cdef11e9af118e76bf42b17d3c/tenor.gif)

oberstet commented 6 years ago

the truth is: we (as a company) have sponsored this effort (the testsuite) a long time, and with a significant amount of resources. currently, all our energy is concentrated on another thing unfortunately

@alexhultman hehe, not quite yet! (I deactivated the gif .. I like it .. but not looking at it all the time;)

https://www.bitmex.com/ looks like you are working in an interesting field;) want to get in touch? XBR ...

vinniefalco commented 6 years ago

@oberstet I appreciate the response. Is there anyone else here who has some free time that might look into this?

ghost commented 6 years ago

@oberstet I don't work at or represent bitmex though

vinniefalco commented 6 years ago

@oberstet A very kind individual on the public Cpplang Slack offered to help, and they are currently running tests to see if the fix has introduced any other problems.

vector-of-bool commented 6 years ago

(Me from Slack) I've talked with @vinniefalco and I think I have a good fix for this. I'll send in a PR when I've verified I haven't broken anything. The gist of it is this:

  1. TESTDATA marks some test payloads as "binary," and we deal with those as raw byte sequences. These cases weren't having trouble because text encoding wasn't important here.
  2. Some tests, such as data1.html deal with "non-binary." In these cases, sendOne() and onMessage() in case12_x_x.py will decode('utf-8') the test payload data before working with it.
jcoglan commented 6 years ago

As the person that opened this issue, and as a fellow open-source maintainer, I'd like to emphasise that shaming people for not getting a fix out within some arbitrary time period is not productive and mostly leads to maintainers burning out.

Also, the value I've got from this software is substantial considering I didn't have to pay for it, and there are many thousands of projects depending on software that's better for having this test suite. If this issue gets fixed, that'll be a nice bonus but I'm happy to work around it.