scoutapp / scout_apm_elixir

ScoutAPM Elixir Agent. Supports Phoenix and other frameworks.
https://scoutapm.com
Other
36 stars 20 forks source link

Scout APM FunctionClauseError after DB error/rollback #42

Closed percygrunwald closed 6 years ago

percygrunwald commented 6 years ago

Environment

Observed behavior

We encountered a situation where our Postgres DB was set to read-only and noticed that Scout APM caused all controllers to raise the following error after the query to create a new session in the DB errored:

[debug] QUERY ERROR db=14.1ms
INSERT INTO "sessions" ("data","device_id","token","inserted_at","updated_at") VALUES ($1,$2,$3,$4,$5) RETURNING "id" [%{"_csrf_token" => "..."}, 3, "...", {{2018, 1, 16}, {2, 43, 19, 194863}}, {{2018, 1, 16}, {2, 43, 19, 194872}}]
[debug] QUERY OK db=0.2ms
rollback []
[error] #PID<0.1817.0> running MyApp.Endpoint terminated
Server: care.myapp.local:4000 (http)
Request: GET /login?alert_message=Please+log+in+to+access+this+resource.&alert_type=warning&redirect_to=http%3A%2F%2Fcare.myapp.local%3A4000%2F
** (exit) an exception was raised:
    ** (FunctionClauseError) no function clause matching in anonymous fn/1 in ScoutApm.TrackedRequest.pop_layer/1
        (scout_apm) lib/scout_apm/tracked_request.ex:181: anonymous fn([]) in ScoutApm.TrackedRequest.pop_layer/1
        (elixir) lib/map.ex:766: Map.get_and_update/3
        (scout_apm) lib/scout_apm/tracked_request.ex:180: ScoutApm.TrackedRequest.pop_layer/1
        (scout_apm) lib/scout_apm/tracked_request.ex:52: ScoutApm.TrackedRequest.stop_layer/2
        (scout_apm) lib/scout_apm/tracked_request.ex:144: ScoutApm.TrackedRequest.with_saved_tracked_request/1
        (scout_apm) lib/scout_apm/plugs/controller_timer.ex:19: ScoutApm.Plugs.ControllerTimer.before_send/1
        (elixir) lib/enum.ex:1829: Enum."-reduce/3-lists^foldl/2-0-"/3
        (plug) lib/plug/conn.ex:1133: Plug.Conn.run_before_send/2
        (plug) lib/plug/conn.ex:393: Plug.Conn.send_resp/1
        (plug) lib/plug/error_handler.ex:94: Plug.ErrorHandler.__catch__/6
        (myapp) lib/myapp/endpoint.ex:1: MyApp.Endpoint.plug_builder_call/2
        (myapp) lib/plug/debugger.ex:99: MyApp.Endpoint."call (overridable 3)"/2
        (myapp) lib/myapp/endpoint.ex:1: MyApp.Endpoint.call/2
        (plug) lib/plug/adapters/cowboy/handler.ex:15: Plug.Adapters.Cowboy.Handler.upgrade/4
        (cowboy) /Users/percy/Code/myapp-server-elixir/deps/cowboy/src/cowboy_protocol.erl:442: :cowboy_protocol.execute/4

If I comment out the inclusion of Scout APM in the controller in web.ex, the following error is raised:

# web/web.ex
...
  def controller do
    quote do
      use Phoenix.Controller
#      use ScoutApm.Instrumentation
...
[debug] QUERY ERROR db=7.8ms
INSERT INTO "sessions" ("data","device_id","token","inserted_at","updated_at") VALUES ($1,$2,$3,$4,$5) RETURNING "id" [%{"_csrf_token" => "..."}, 3, "...", {{2018, 1, 16}, {2, 47, 31, 612493}}, {{2018, 1, 16}, {2, 47, 31, 612500}}]
[debug] QUERY OK db=0.3ms
rollback []
[info] Sent 500 in 26ms
[error] #PID<0.1839.0> running MyApp.Endpoint terminated
Server: care.myapp.local:4000 (http)
Request: GET /login?alert_message=Please+log+in+to+access+this+resource.&alert_type=warning&redirect_to=http%3A%2F%2Fcare.myapp.local%3A4000%2F
** (exit) an exception was raised:
    ** (Postgrex.Error) ERROR 25006 (read_only_sql_transaction): cannot execute INSERT in a read-only transaction
        (ecto) lib/ecto/adapters/sql.ex:554: Ecto.Adapters.SQL.struct/8
        (ecto) lib/ecto/repo/schema.ex:547: Ecto.Repo.Schema.apply/4
        (ecto) lib/ecto/repo/schema.ex:213: anonymous fn/14 in Ecto.Repo.Schema.do_insert/4
        (ecto) lib/ecto/repo/schema.ex:774: anonymous fn/3 in Ecto.Repo.Schema.wrap_in_transaction/6
        (ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
        (db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4
        (db_connection) lib/db_connection.ex:1199: DBConnection.run_begin/3
        (db_connection) lib/db_connection.ex:790: DBConnection.transaction/3
        (myapp) lib/myapp/plug/session/db.ex:71: MyApp.Plug.Session.DB.create_session/2
        (myapp) lib/myapp/plug/session/db.ex:25: MyApp.Plug.Session.DB.put/4
        (myapp) web/plugs/fetch_session_plug.ex:95: MyApp.FetchSessionPlug.put_session/3
        (elixir) lib/enum.ex:1829: Enum."-reduce/3-lists^foldl/2-0-"/3
        (plug) lib/plug/conn.ex:1133: Plug.Conn.run_before_send/2
        (plug) lib/plug/conn.ex:393: Plug.Conn.send_resp/1
        (myapp) web/controllers/auth_controller.ex:1: MyApp.AuthController.action/2
        (myapp) web/controllers/auth_controller.ex:1: MyApp.AuthController.phoenix_controller_pipeline/2
        (myapp) lib/myapp/endpoint.ex:1: MyApp.Endpoint.instrument/4
        (phoenix) lib/phoenix/router.ex:279: Phoenix.Router.__call__/1
        (myapp) lib/plug/error_handler.ex:64: MyApp.Router.call/2
        (myapp) lib/myapp/endpoint.ex:1: MyApp.Endpoint.plug_builder_call/2

Expected behavior

In this case, the inclusion of Scout APM made it harder for us to diagnose the underlying issue because the actual root error was superseded by the Scout error. This is less than ideal, it would be better if Scout had the error handling to handle this case so only the root error remains.

Steps to reproduce

In our case, I was able to reproduce the error by setting my local Postgres database to read-only with the following command:

ALTER DATABASE myapp_dev SET default_transaction_read_only = true;

Our application stores sessions in the database, so there is an INSERT query that gets executed in the session plug before the request hits the controller when a request comes in without an existing session cookie.

To reproduce this issue, it should be sufficient to set the DB to read-only and then try to INSERT something into the DB in a pre-controller plug.

itsderek23 commented 6 years ago

Thanks @pgrunwald for the detailed report.

We'll check it out this week.

/cc @cschneid

itsderek23 commented 6 years ago

@pgrunwald - we're able to reproduce. We'll be looking into a fix.

mitchellhenke commented 6 years ago

@pgrunwald apologies, but while taking another look into this, I wasn't able to reproduce. I've put together a minimal set of changes to try to reproduce here: https://github.com/mitchellhenke/scout_testing/compare/8885e78348938e6d91fc41d64fdb22f5f38954b2...reproduce-42?expand=1

With those changes and after running ALTER DATABASE scout_testing_dev SET default_transaction_read_only = true; I get a stacktrace like:

begin []
[debug] QUERY ERROR db=34.5ms
INSERT INTO "users" ("inserted_at","updated_at") VALUES ($1,$2) RETURNING "id" [{{2018, 1, 26}, {21, 35, 3, 503943}}, {{2018, 1, 26}, {21, 35, 3, 503968}}]
[debug] QUERY OK db=1.6ms
rollback []
[error] #PID<0.655.0> running ScoutTestingWeb.Endpoint terminated
Server: localhost:4000 (http)
Request: GET /
  ** (exit) an exception was raised:
    ** (Postgrex.Error) ERROR 25006 (read_only_sql_transaction): cannot execute INSERT in a read-only transaction
        (ecto) lib/ecto/adapters/sql.ex:554: Ecto.Adapters.SQL.struct/8

Is there anything in my diff that is different than your setup? Any help in tracking this down would be very much appreciated 🙂

percygrunwald commented 6 years ago

@mitchellhenke,

Ah, apologies, I missed an important detail in my initial bug report: I turned on dev_trace in the config. The scout_apm config from which I am able to reproduce the bug on my local is shown below.

# config/dev.exs
...
config :scout_apm,
  dev_trace: true
...
mitchellhenke commented 6 years ago

@pgrunwald

I gave that a go, but unfortunately wasn't able to reproduce.

percygrunwald commented 6 years ago

@mitchellhenke, yeah I forked your scout_testing repo and tried to recreate and it seems I'm not able to. I guess there is something in the more complex setup of the application where I am getting the error that is causing it.

In any case, the problem is caused by pattern matching on an empty list here:

# deps/scout_apm/lib/scout_apm/tracked_request.ex:179
    {cur_layer, s1} =
      Map.get_and_update(s0, :layers,
                         fn [cur | rest] -> {cur, rest} end)

In my application, the argument passed to the anonymous function is [], which does not match [cur | rest]. I would say that even though we don't know how my application is able to get into a state in which the :layers key is an empty list, we do know that it's a possibility and scout_apm should probably have clauses in functions that pattern match on empty lists.