Azolo / websockex

An Elixir Websocket Client
MIT License
515 stars 97 forks source link

handle_frame swallowing exception? #51

Open pikeas opened 6 years ago

pikeas commented 6 years ago
defmodule Foo.Client do
  use WebSockex
  require Logger

  @name __MODULE__
  @uri "..."

  def child_spec([]) do
    %{
      id: @name,
      start: {@name, :start_link, []},
    }
  end

  def start_link() do
    WebSockex.start_link(@uri, @name, :foo, name: @name)
  end

  def handle_frame({:text, msg}, state) do
    Logger.info msg
    raise "oops"
  end

If something unexpected happens in handle_frame, the exception is completely swallowed - the process crashes but nothing is logged to the console.

Azolo commented 6 years ago

It does catch them so that it can run terminate. Are you manually exiting out of terminate?

pikeas commented 6 years ago

Ah, that's probably it - this module hasn't defined terminate.

https://github.com/Azolo/websockex/blob/cb615af5efbe59d1f13e2574d982d5d0a6e6beb4/lib/websockex.ex#L304

I don't see any error handling here - somewhere between handle_frame and terminate WebSockex is eating the error. How can I get standard Erlang/Elixir behavior, where an unexpected failure loudly crashes the process with a stacktrace?

Azolo commented 6 years ago

That is the expected behavior. If you haven't defined terminate/2 then it should fail loudly.

There's nothing that's written to logger explicitly like :gen_server (:gen_server produces 2 messages) does, but the raised exit message should be a crash message with a stacktrace.

Here's the test for it. https://github.com/Azolo/websockex/blob/cb615af5efbe59d1f13e2574d982d5d0a6e6beb4/test/websockex_test.exs#L779-L786

I'll try to reproduce it in a bit, but since the test passed I'm wondering if it's something else in the way that they should be reported that I am missing.

pikeas commented 6 years ago

Let me know if you're unable to reproduce and I'll put together a minimal broken example.

pikeas commented 6 years ago

There's nothing that's written to logger explicitly like :gen_server (:gen_server produces 2 messages) does, but the raised exit message should be a crash message with a stacktrace.

Could you clarify what this means? The process is definitely crashing, the crash is caught by a supervisor which restarts the process. If the crash is being properly raised (including stacktrace), what's unexpected is the lack of logging.

From my perspective as a consumer of the library, a quiet crash makes debugging extremely difficult and counterintuitive. For example, it took me about an hour to realize I'd made a mistake in how I was using Poison to decode a response in handle_frame, which I would have noticed immediately with a standard stacktrace to console.

Azolo commented 6 years ago

Yes, the supervisor is "eating" the exit and I agree that a logging mechanism is definitely necessary. Normally OTP would receive those exits and create some kind of message. GenServer gets around that limitation by using :error_logger and writing the cause and some information to the console.

The problem for me is that there are some libraries that depend on :error_logger wrapping to flag exits and exceptions so I don't want to just use Logger. When I tried using :error_logger I couldn't capture it to test it, and after a lot of frustration I took a break. I haven't gotten around to trying it again.

In the interim as a workaround, you can do something like:

def terminate({exception, stacktrace}, state) do
  if Exception.exception?(exception) do
    require Logger
    Logger.error fn -> Exception.format(:error, exception, stacktrace) end
  end
  :ok
end
def terminate(_,_), do: :ok

That(#30) and SSL connections(#1) are why it's not 1.0 release yet.

stocks29 commented 5 years ago

This was really tripping me up. I couldn't figure out why a process I had linked to my socket kept getting a new pid. I'd recommend putting this logging in as the default behavior until this can be properly addressed as you desire.