CityBaseInc / airbrake_client

Airbrake client to report errors and exceptions to Airbrake.io.

Handle errors from HTTP requests #22

Open jdfrens opened 3 years ago


Another developer and I have had problems testing Airbrake.report/2 today. Unfortunately, Airbrake.report/2 always returns :ok with no output or logging, even when the HTTPoison.post/3 call fails.

The situation

We make the HTTP POST in Airbrake.Worker.send_report/3 (a private function):

https://github.com/CityBaseInc/airbrake_client/blob/d81d95c27db0ba3c8a71fb7396c91f9a5d4e5a44/lib/airbrake/worker.ex#L114

The value returned here is never used or pattern-matched against.

I filled in the necessary values and ran the equivalent command from the iex console, and I got {:error, %HTTPoison.Error{id: nil, reason: :timeout}} as a response sometimes; other times I would get a 200 OK success.
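To make the two outcomes above concrete, here is a minimal sketch of what pattern-matching on the POST result could look like. `ReportResult` and `classify/1` are hypothetical names, not part of airbrake_client, and treating any 2xx status as success is an assumption.

```elixir
defmodule ReportResult do
  @moduledoc """
  Hypothetical helper (not part of airbrake_client): turns the value
  returned by an HTTP POST into :ok or a tagged error.
  """

  # Plain maps are matched instead of %HTTPoison.Response{} and
  # %HTTPoison.Error{} so the sketch compiles without the dependency.
  # Structs are maps, so the same clauses also match the real structs.

  # Assumption: any 2xx status counts as a successful report.
  def classify({:ok, %{status_code: code}}) when code in 200..299, do: :ok

  # The request completed, but the server answered with a non-2xx status.
  def classify({:ok, %{status_code: code}}), do: {:error, {:http_status, code}}

  # The request itself failed (for example, a timeout).
  def classify({:error, %{reason: reason}}), do: {:error, reason}
end

ReportResult.classify({:error, %{reason: :timeout}})
# => {:error, :timeout}

ReportResult.classify({:ok, %{status_code: 500, body: "<html>...</html>"}})
# => {:error, {:http_status, 500}}
```

With something like this in place, the caller at least has a value it can log or act on instead of a discarded result.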

I tried setting timeout options: [timeout: 50_000, recv_timeout: 50_000]. Now the requests always go through, but I'm getting a lot of 500s (with a giant "Internal Server" HTML page as the body). So ultimately it seems that there is a problem on the Airbrake.io side of things.

So an internal server error from Airbrake.io is bad, but not under our control. This library should react to a problem like this better than by ignoring it.

Options

Right now, airbrake_client swallows these errors completely. This is not ideal; it would be nice if there were some record of the problem.

Some options we might like to consider:

  1. Bump up the timeouts.
    • By default they are 8s to make a connection, 5s to wait for a response. Of course, the longer the timeout, the more it could become a bottleneck.
    • This only solves timeout errors. An internal server error from Airbrake.io isn't handled any better with just this solution.
  2. Log the error.
  3. Retry the POST.
    • Try the call three times in the code, maybe with greater timeouts each time (especially with a timeout error).
    • Send the request back to the Airbrake.Worker process.
    • Queue up requests to send later.
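Option 2 could be as small as the following sketch. `ReportLogging` and `log_failure/1` are hypothetical names, the log message format is made up for illustration, and the response body is deliberately left out of the log because the 500 responses carry a large HTML page.

```elixir
defmodule ReportLogging do
  require Logger

  # Hypothetical wrapper (not part of airbrake_client): logs anything other
  # than a 2xx response, then returns the result unchanged so the caller
  # can still act on it.

  # Assumption: any 2xx status counts as success and is not logged.
  def log_failure({:ok, %{status_code: code}} = result) when code in 200..299 do
    result
  end

  # Non-2xx response: log only the status code, not the body.
  def log_failure({:ok, %{status_code: code}} = result) do
    Logger.error("Airbrake report failed: HTTP #{code}")
    result
  end

  # Transport-level failure such as a timeout.
  def log_failure({:error, %{reason: reason}} = result) do
    Logger.error("Airbrake report failed: #{inspect(reason)}")
    result
  end
end
```

Because the function returns its argument, it could be piped directly after the POST call without changing what the surrounding code receives.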

Options 1 and 2 have simple solutions that could be effective. All of the solutions under option 3 raise the complexity of the library significantly, and when one of them breaks, it will break even harder than what we have now (thrashing processes and supervisors).
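For comparison, the simplest variant of option 3 (retry in place, with a longer timeout each time) might look like the sketch below. `ReportRetry`, `post_with_retry/2`, and `post_fun` are hypothetical names, and the second and third timeout pairs are arbitrary illustrative values. The POST is passed in as a function so the sketch runs without HTTPoison.

```elixir
defmodule ReportRetry do
  # Timeouts in milliseconds for each attempt. The first pair matches the
  # defaults quoted above (8s to connect, 5s to wait for a response); the
  # later pairs are arbitrary illustrative values.
  @attempts [
    [timeout: 8_000, recv_timeout: 5_000],
    [timeout: 16_000, recv_timeout: 10_000],
    [timeout: 32_000, recv_timeout: 20_000]
  ]

  # `post_fun` is a hypothetical one-argument function that performs the
  # POST with the given options, for example:
  #   fn opts -> HTTPoison.post(url, payload, headers, opts) end
  def post_with_retry(post_fun, attempts \\ @attempts)

  # Nothing to try at all.
  def post_with_retry(_post_fun, []), do: {:error, :no_attempts}

  def post_with_retry(post_fun, [opts | rest]) do
    case post_fun.(opts) do
      # Assumption: any 2xx status counts as success.
      {:ok, %{status_code: code}} = ok when code in 200..299 ->
        ok

      # Last attempt failed: hand the failure back to the caller.
      failure when rest == [] ->
        failure

      # Otherwise try again with the next, longer timeouts.
      _failure ->
        post_with_retry(post_fun, rest)
    end
  end
end
```

Even this version blocks the calling process for the sum of all the timeouts in the worst case, which is the bottleneck concern raised under option 1.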