elixir-ecto / db_connection

Database connection behaviour
http://hexdocs.pm/db_connection/DBConnection.html

Holder.checkout deadlock? #198

Closed tim2CF closed 5 years ago

tim2CF commented 5 years ago

I'm trying to test some cluster features of my app with the ex_unit_clustered_case package. The local cluster starts successfully, but when I call Ecto Repo functions on a slave node, Repo always returns an error:

%DBConnection.ConnectionError{
    message: "connection not available and request was dropped from queue after 1333ms. You can configure how long requests wait in the queue using :queue_target and :queue_interval. See DBConnection.start_link/2 for more information"
  }
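For readers who land here from the same error: the two options named in the message are set alongside the other repo options. The fragment below is only a sketch; the app name `:my_app` and the repo module `MyApp.Repo` are hypothetical placeholders, and the values shown are the defaults documented for DBConnection as far as I know (50 ms and 1000 ms). Raising them only lengthens how long a request may wait in the queue; it does not help if no connection can be established at all.

```elixir
# config/config.exs
# NOTE: :my_app and MyApp.Repo are hypothetical names used for illustration.
import Config

config :my_app, MyApp.Repo,
  # Target time (ms) a checkout is expected to wait in the queue.
  queue_target: 50,
  # Window (ms) over which queue times are measured against the target.
  queue_interval: 1_000
```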

I started investigating what is happening on the cluster nodes at that moment using the observer tool and found some interesting processes at the top:

[Image: Screenshot 2019-06-21 at 12 29 27]

Could there be a deadlock in this receive expression? https://github.com/elixir-ecto/db_connection/blob/9d2416a4b0e9f9f465e639e090027e9acc7d9d77/lib/db_connection/holder.ex#L237

Maybe you have an idea how to fix it?

josevalim commented 5 years ago

What is your DBConnection version? In any case, without an isolated mechanism to consistently reproduce the issue I am afraid there isn't much we can do.

tim2CF commented 5 years ago

version 2.1.0

michalmuskala commented 5 years ago

Looking at the screenshot you pasted, I wonder what the message stuck in the queue is. You should be able to see it with observer.
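If observer is awkward to use on a clustered node, the same information can be read from a shell with `Process.info/2`. This is a minimal sketch: `pid` here is a stand-in (the current process with one test message sent to it), not the actual process from the screenshot. `Process.info/2` only works for local pids, so it would have to be run on the node where the process lives.

```elixir
# `pid` is a hypothetical stand-in for the pid shown in observer.
# Here we use the current process and send it one message so there
# is something in the mailbox to look at.
pid = self()
send(pid, :example_message)

# Number of messages currently waiting in the process mailbox.
{:message_queue_len, len} = Process.info(pid, :message_queue_len)

# The messages themselves (can be large for a busy process).
{:messages, messages} = Process.info(pid, :messages)

IO.inspect({len, messages}, label: "mailbox")
```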

tim2CF commented 5 years ago

Unfortunately, it looks like these processes are restarting instantly and the query queue is stuck: https://ibb.co/rkzNFQw

tim2CF commented 5 years ago

The cause was a max_connections parameter that was set too low on the Postgres side... It happened because I reinstalled Postgres recently. I was just expecting a different, more obvious error message for this case 😀
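For anyone hitting the same thing in a clustered test setup: every node that starts the repo opens its own pool, so the total number of connections is roughly the number of nodes times `pool_size`, and that total has to stay below Postgres's max_connections. A sketch of keeping the pool small, assuming a hypothetical app `:my_app`, repo `MyApp.Repo`, and made-up numbers:

```elixir
# config/test.exs
# NOTE: :my_app, MyApp.Repo and all numbers are hypothetical.
import Config

# Example budget: 3 cluster nodes * pool_size 5 = 15 connections,
# which must fit under the max_connections value in postgresql.conf.
config :my_app, MyApp.Repo,
  pool_size: 5
```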