faye / websocket-driver-ruby

WebSocket protocol handler with pluggable I/O

opcode -1 #41

Closed miukki closed 8 years ago

miukki commented 8 years ago

I got opcode -1 on the client.

Somehow the wss connection went down with opcode: -1.

Then it connected again at 17:01:37.

With opcode: -1 I also got status: finished in the Network panel (Chrome).

I also have some server logs from around the time of the opcode: -1. From puma.log:

I, [2016-03-28T16:27:36.971471 #23502]  INFO -- omniauth: (salesforce) Request phase initiated.
[23618] + Gemfile in context: /home/deployer/apps/iam_api/releases/20160328082647/Gemfile
[23621] + Gemfile in context: /home/deployer/apps/iam_api/releases/20160328082647/Gemfile
[1331] - Worker 2 (pid: 23618) booted, phase: 4
[1331] - Worker 3 (pid: 23621) booted, phase: 4
I, [2016-03-28T16:44:25.667409 #23621]  INFO -- omniauth: (salesforce) Request phase initiated.
I, [2016-03-28T16:44:39.428691 #23618]  INFO -- omniauth: (salesforce) Callback phase initiated.

Looking at this log, my conclusion is that the server was rebooted at 16:38.

The other log file doesn't have any entries for the 16:30 to 16:38 gap, which is when I got opcode: -1.

61.150.91.55, 120.132.92.14 - - [27/Mar/2016:16:26:51 +0800] "GET / HTTP/1.0" 302 - 0.0008
119.188.112.227, 120.132.92.14 - - [27/Mar/2016:16:41:30 +0800] "GET / HTTP/1.0" 302 - 0.0008

And I have one more question:

Is it all right to get several clientIds in one tab during a single user session (because Faye sometimes goes down and then comes back up), or is that abnormal? Please check the screenshot.

image

miukki commented 8 years ago

@jcoglan can you help me understand this issue? Thanks.

jcoglan commented 8 years ago

I don't know why a browser would ever report that a WebSocket frame has opcode -1. Opcodes are bits of metadata that are part of WebSocket frames, and they consist of 4 bits representing an unsigned integer from 0 to 15, of which the protocol defines values from 0 to 10. A WebSocket parser interpreting those four bits as -1 is an error.
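As a rough illustration of why -1 cannot come from the wire format itself, here is a minimal sketch of reading the opcode from the first byte of a frame. The helper names `opcode_of` and `describe_opcode` are made up for this example and are not part of websocket-driver; the opcode table follows RFC 6455.

```ruby
# Opcodes defined by RFC 6455; every other 4-bit value is reserved.
OPCODES = {
  0x0 => :continuation,
  0x1 => :text,
  0x2 => :binary,
  0x8 => :close,
  0x9 => :ping,
  0xA => :pong
}.freeze

# Hypothetical helper: the opcode is the low 4 bits of the first frame byte,
# so the result is always an integer between 0 and 15.
def opcode_of(first_byte)
  first_byte & 0x0F
end

# Hypothetical helper: map the opcode to a name, or :reserved if undefined.
def describe_opcode(first_byte)
  OPCODES.fetch(opcode_of(first_byte), :reserved)
end

p describe_opcode(0x81) # FIN bit set, text frame
p describe_opcode(0x8A) # FIN bit set, pong frame
p describe_opcode(0x83) # FIN bit set, reserved opcode 3
```

Whatever byte arrives, the masked value stays in the 0 to 15 range, so a reported -1 most likely comes from the client's own error handling rather than from the frame.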

I would need to see an example program that triggers this behaviour in order to investigate further.

> Is it all right to get several clientIds in one tab during a single user session (because Faye sometimes goes down and then comes back up), or is that abnormal? Please check the screenshot.

Yes. If the client is disconnected for long enough, its session times out and it has to get a new clientId from the server. This is normal reconnection behaviour.

jcoglan commented 8 years ago

@miukki Did you get any further with this issue? I'm going to need some example code that demonstrates the problem in order to make any progress.

k-yamada commented 8 years ago

I ran into the same error.

My program was continuously sending image files over the WebSocket. I think the cause of the error is that @rack_hijack_io.write(data) is not synchronized.

environment

monkey patch

When I apply the following monkey patch, this error no longer occurs.

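# ext/faye/rack_stream.rb:

module Faye
  class RackStream
    # Keep a handle on the original constructor so the patch can call it.
    alias_method :__initialize__, :initialize

    def initialize(socket)
      # One mutex per stream, created before the original setup runs.
      @mutex = Mutex.new
      __initialize__(socket)
    end

    def write(data)
      # Serialize writes to the hijacked socket so that data written by
      # different threads cannot interleave.
      return @mutex.synchronize { @rack_hijack_io.write(data) } if @rack_hijack_io
      return @stream_send.call(data) if @stream_send
    rescue => e
      fail if EOFError === e
    end
  end
end

jcoglan commented 8 years ago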

@k-yamada If you're using faye-websocket then all interaction with the underlying TCP stream should be happening on the same thread, and certainly all interaction with the driver. Can you check whether, in your system, a single socket is being written to by multiple threads?

k-yamada commented 8 years ago

@jcoglan

> Can you check whether, in your system, a single socket is being written to by multiple threads?

Yes. In my system, a single socket is being written to by multiple threads.

Thanks.
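One plausible way an invalid opcode can reach the client, assuming writes from two threads interleave on the wire: the bytes of one frame land inside another, so the parser ends up reading payload bytes as a frame header. The sketch below simulates this with plain strings. `build_frame` and `parse_frames` are hypothetical helpers that only handle unmasked frames with payloads under 126 bytes; they are not part of websocket-driver or faye-websocket.

```ruby
# Hypothetical helper: build a minimal unmasked frame with the FIN bit set.
def build_frame(opcode, payload)
  raise ArgumentError, "payload too long for this sketch" if payload.bytesize > 125
  [0x80 | opcode, payload.bytesize].pack("CC") + payload.b
end

# Hypothetical helper: parse consecutive frames from a byte string and
# return [opcode, payload] pairs. No validation, to show what a parser sees.
def parse_frames(bytes)
  frames = []
  offset = 0
  while offset + 2 <= bytes.bytesize
    first, second = bytes.byteslice(offset, 2).unpack("CC")
    opcode = first & 0x0F   # low 4 bits of the first byte
    length = second & 0x7F  # low 7 bits of the second byte
    frames << [opcode, bytes.byteslice(offset + 2, length)]
    offset += 2 + length
  end
  frames
end

frame_a = build_frame(0x2, "A" * 10) # binary frame from thread A
frame_b = build_frame(0x2, "C" * 10) # binary frame from thread B

# Writes that do not overlap: both frames parse as binary (opcode 2).
p parse_frames(frame_a + frame_b).map(&:first) # => [2, 2]

# Thread B's frame lands in the middle of thread A's frame.
torn = frame_a.byteslice(0, 6) + frame_b + frame_a.byteslice(6, frame_a.bytesize)

# The second "header" is really payload: "C" is 0x43, so the opcode reads as 3,
# which is reserved.
p parse_frames(torn).map(&:first) # => [2, 3]
```

This is only a simulation of the byte stream; it does not show what Chrome does internally when it meets such a frame.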

jcoglan commented 8 years ago

@k-yamada I assume this means you're calling Faye::WebSocket#send from multiple threads. Have you tried making sure these calls are on the EventMachine thread, like this:

EventMachine.next_tick do
  faye_websocket.send(message)
end

k-yamada commented 8 years ago

@jcoglan

> I assume this means you're calling Faye::WebSocket#send from multiple threads.

Yes.

> Have you tried making sure these calls are on the EventMachine thread, like this:

No, I have not.

I've applied EventMachine.next_tick in my system, and the error no longer occurs. Thanks!

jcoglan commented 8 years ago

Great, I'll close this issue but please leave more comments if this problem surfaces again. The general position of this library is that since it's designed to integrate with any Ruby IO/concurrency framework, it doesn't implement any concurrency stuff itself -- we could put a mutex in, but e.g. EventMachine users won't need that and it only adds performance overhead in that case.

If you're using EM, it's on you to make sure you interact with it on the EM thread, which is what next_tick does.
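For setups without EventMachine, the same idea (many threads may produce messages, but only one thread ever touches the socket) can be sketched with Ruby's built-in Queue. This is an assumption-laden illustration, not how EventMachine or this library works internally: `SingleWriter` and `send_data` are hypothetical names, and a StringIO stands in for the socket.

```ruby
require "stringio"

# Hypothetical helper: funnels writes from many threads onto one writer thread.
class SingleWriter
  def initialize(io)
    @io = io
    @queue = Queue.new
    @thread = Thread.new do
      # Only this thread ever calls @io.write; a nil item ends the loop.
      while (data = @queue.pop)
        @io.write(data)
      end
    end
  end

  # Safe to call from any thread: it only enqueues the data.
  def send_data(data)
    @queue << data
  end

  # Enqueue the nil sentinel and wait for the writer to drain the queue.
  def close
    @queue << nil
    @thread.join
  end
end

io = StringIO.new # stand-in for a socket
writer = SingleWriter.new(io)

producers = 4.times.map do |i|
  Thread.new { 100.times { writer.send_data("[#{i}]") } }
end
producers.each(&:join)
writer.close

# Every message arrives whole because writes never overlap.
puts io.string.scan(/\[\d\]/).size
```

Compared with a mutex around the write call, this keeps all I/O on one thread, which is closer in spirit to what next_tick achieves under EM.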