TooTallNate / Java-WebSocket

A barebones WebSocket client and server implementation written in 100% Java.
http://tootallnate.github.io/Java-WebSocket
MIT License

String decoding problem #566

Closed sonwh98 closed 7 years ago

sonwh98 commented 7 years ago

I am using https://github.com/cognitect/transit-java to send JSON-encoded messages between server and client.

On the server I'm using Clojure transit to encode the list ["a" "b"], which gives the string "[\"a\",\"b\"]" with a hashCode of -939944225.

On the client, in WebSocketClient.onMessage, I get message="[\"a\",\"b\"]" with a hashCode of -120497949. When printed to the console, the string looks identical to what was encoded, but it has a different hashCode, which causes decoding of the transit message to fail.

I built a custom jar from master because the jar from Maven Central failed with a handshake error.

marci4 commented 7 years ago

Hello @sonwh98,

thx for your bug report! I think this issue is related not to this project itself but to the transit-java project, because the encoding is not specified there (see this open issue). The WebSocket protocol specifies that if you send a text frame, its payload has to be encoded in UTF-8.

What was the handshake error?

Greetings marci4

sonwh98 commented 7 years ago

Thanks for responding. I will report the handshake error in a different issue.

Transit is UTF-8 encoded. I have no problem decoding it on the JavaScript side in a web browser.

In the onMessage callback, the hashCode of message:"[\"a\",\"b\"]" is -120497949. However, in the same code block, "[\"a\",\"b\"]".hashCode() gives -939944225, as it does on the Clojure side on the server. This tells me that the hashCode -939944225 is correct.
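When two strings print the same but hash differently, dumping the length and every UTF-16 code unit usually shows where they diverge. The snippet below is only a debugging sketch: the class name `StringDump` and the helper `describe` are made up for illustration and are not part of Java-WebSocket or transit-java.

```java
final class StringDump {
    /** Renders length, hashCode and each UTF-16 code unit (as hex) of a string. */
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        sb.append("length=").append(s.length());
        sb.append(" hash=").append(s.hashCode());
        sb.append(" units=");
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The literal the server is expected to produce for ["a" "b"].
        String expected = "[\"a\",\"b\"]";
        System.out.println(describe(expected));
        // prints: length=9 hash=-939944225 units=005b 0022 0061 0022 002c 0022 0062 0022 005d
    }
}
```

Calling `describe(message)` inside onMessage and comparing the two lines side by side should reveal extra or different characters that the console output hides.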

https://github.com/TooTallNate/Java-WebSocket/blob/84fc70e7d66b1720b691bdff568c010a7a7183fb/src/main/java/org/java_websocket/util/Charsetfunctions.java#L87 seems more complicated than it needs to be. It can be replaced with a one-liner: new String(bytes.array(), "UTF-8");
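One caveat with that one-liner, independent of this library: `ByteBuffer.array()` returns the whole backing array and ignores the buffer's position and limit, and the `String` constructor silently replaces malformed bytes instead of reporting them. The following is a sketch of a stricter decode using `CharsetDecoder`; the class and method names (`Utf8Decode`, `strictUtf8`) are hypothetical and this is not the library's actual code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

final class Utf8Decode {
    /** Decodes only the bytes between position and limit; throws on invalid UTF-8. */
    static String strictUtf8(ByteBuffer bytes) throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        // duplicate() leaves the caller's position and limit untouched.
        return decoder.decode(bytes.duplicate()).toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] raw = "xabcx".getBytes("UTF-8");
        ByteBuffer slice = ByteBuffer.wrap(raw, 1, 3); // only "abc" is in view

        System.out.println(strictUtf8(slice));                  // prints: abc
        System.out.println(new String(slice.array(), "UTF-8")); // prints: xabcx

        ByteBuffer invalid = ByteBuffer.wrap(new byte[] {(byte) 0xC3, (byte) 0x28});
        try {
            strictUtf8(invalid);
        } catch (CharacterCodingException e) {
            System.out.println("malformed input rejected");
        }
    }
}
```

For a buffer that wraps exactly one complete, valid message, both approaches give the same string; they differ only for sliced buffers or invalid input.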

marci4 commented 7 years ago

Hello @sonwh98,

Well, it could be fine for the browser, since there you don't compare the strings at the hashCode level.

So you are saying the hashCode changes after a hashCode() call?

Thx for your input. That is not possible right now because of the Java 1.5 support (keeping Android support in mind).

Greetings marci4

sonwh98 commented 7 years ago

@marci4 I think I know what's going on, but I do not know if this is the proper behavior.

Encoding the vector ["a"] in transit produces a string which has 5 characters and a hashCode of 85147669:

[
"
a
"
]

In onMessage, the value of message is a string which has 9 characters and a hashCode of -530250003. The characters in that string are:

"
[
\
"
a
\
"
]
"

It seems the string in the message has 2 extra " and 2 extra \ characters. Is this the expected behavior? It is not what I am expecting: I should get the original string that was sent from the server.

sonwh98 commented 7 years ago

@marci4 OK, I think this is a problem with the server-side Clojure WebSocket lib I'm using (https://github.com/jarohen/chord). I sent a raw Clojure vector without encoding it with transit and got the expected string on the Kotlin side. This shows that the chord WebSocket library already does some kind of encoding before sending the message to the client. It works in ClojureScript in the browser only because I'm using the chord client-side lib, which somehow knows how to decode it. Sorry to bother you! The fix was to configure chord to format as string instead of EDN: https://github.com/sonwh98/wocket/commit/4384e898dbe60a99e0d7da50e6b5c7d12e630c58
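The numbers reported earlier in the thread are consistent with the payload being serialized a second time as a string literal. The sketch below reproduces them in plain Java. The helpers `quote` and `unquote` are hypothetical stand-ins for "wrap as a quoted string, escaping \ and ""; they are not chord's or transit's API, and real EDN/JSON escaping covers more characters.

```java
final class DoubleEncodingDemo {
    /** Wraps text in double quotes, escaping backslashes and double quotes. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' || c == '"') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    /** Reverses quote(): strips the outer quotes and removes one level of escaping. */
    static String unquote(String s) {
        if (s.length() < 2 || s.charAt(0) != '"' || s.charAt(s.length() - 1) != '"') {
            throw new IllegalArgumentException("not a quoted string: " + s);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < s.length() - 1; i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length() - 1) {
                c = s.charAt(++i); // take the escaped character literally
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String once = "[\"a\"]";    // what transit produced: ["a"]
        String twice = quote(once); // what arrived in onMessage: "[\"a\"]"

        System.out.println(once.length() + " " + once.hashCode());   // prints: 5 85147669
        System.out.println(twice.length() + " " + twice.hashCode()); // prints: 9 -530250003
        System.out.println(unquote(twice).equals(once));             // prints: true
    }
}
```

Unwrapping on the client would work as a stopgap, but configuring the server to send the string unmodified, as in the commit above, avoids the extra layer entirely.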