minijackson / paddle

A library simplifying LDAP usage in Elixir projects
MIT License

Connection Timeout #11

Open sgeos opened 6 years ago

sgeos commented 6 years ago

The problem may be on our end, but Paddle's LDAP connection appears to be timing out. The application works for a while, but then the connection seems to go bad. What is the best way to reestablish the connection when this happens?

14:38:45.708 request_id=10arplb3dffjiu9pcleek143qaptqve1 [info] POST /authenticate
14:38:47.748 [error] #PID<0.1971.0> running MyApp.Endpoint terminated
Server: myproject-api:8080 (http)
Request: POST /authenticate
** (exit) exited in: GenServer.call(Paddle, {:authenticate, 'uid=user,ou=People,dc=domain,dc=com', 'user'}, 5000)
   ** (EXIT) time out
14:38:48.962 [error] #PID<0.1972.0> running MyApp.Endpoint terminated
Server: myproject-api:8080 (http)
Request: POST /authenticate
** (exit) exited in: GenServer.call(Paddle, {:authenticate, 'uid=user,ou=People,dc=domain,dc=com', 'user'}, 5000)
   ** (EXIT) time out
14:38:50.712 [error] #PID<0.1973.0> running MyApp.Endpoint terminated
Server: myproject-api:8080 (http)
Request: POST /authenticate
** (exit) exited in: GenServer.call(Paddle, {:authenticate, 'uid=user,ou=People,dc=domain,dc=com', 'user'}, 5000)
   ** (EXIT) time out
14:38:55.092 request_id=8l0msrlt26jl7t3fo1tulf4k8f1nl4at [info] POST /authenticate
14:39:00.099 [error] #PID<0.1974.0> running MyApp.Endpoint terminated
Server: myproject-api:8080 (http)
Request: POST /authenticate
** (exit) exited in: GenServer.call(Paddle, {:authenticate, 'uid=user,ou=People,dc=domain,dc=com', 'user'}, 5000)
   ** (EXIT) time out
14:39:01.802 request_id=nq8cd2l5jsch6nsu37bqmmjiop7s3qcg [info] POST /authenticate
14:39:06.807 [error] #PID<0.1975.0> running MyApp.Endpoint terminated
Server: myproject-api:8080 (http)
Request: POST /authenticate
** (exit) exited in: GenServer.call(Paddle, {:authenticate, 'uid=user,ou=People,dc=domain,dc=com', 'user'}, 5000)
   ** (EXIT) time out
14:39:08.415 request_id=vfo18ec9f30tqi2aaa8dj6pucobu9h0c [info] POST /authenticate
14:39:10.238 request_id=33hoql67t1qvak09t2iap446keutnog0 [info] POST /authenticate
14:39:13.423 [error] #PID<0.1976.0> running MyApp.Endpoint terminated
Server: myproject-api:8080 (http)
Request: POST /authenticate
** (exit) exited in: GenServer.call(Paddle, {:authenticate, 'uid=user,ou=People,dc=domain,dc=com', 'user'}, 5000)
   ** (EXIT) time out
14:39:15.245 [error] #PID<0.1977.0> running MyApp.Endpoint terminated
Server: myproject-api:8080 (http)
Request: POST /authenticate
** (exit) exited in: GenServer.call(Paddle, {:authenticate, 'uid=user,ou=People,dc=domain,dc=com', 'user'}, 5000)
   ** (EXIT) time out
minijackson commented 6 years ago

As a temporary workaround, you could just restart the OTP application by doing:

Application.stop(:paddle)
Application.start(:paddle)

but I'm wondering if there is a better way of handling this, like adding a reconnect function or returning {:error, :timeout}. If you have any ideas, let me know!
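A minimal sketch of how the restart workaround and the {:error, :timeout} idea could be combined on the caller's side. LdapRecovery, with_recovery/2 and restart_paddle/0 are hypothetical names, not part of Paddle; the only assumption about Paddle is that the timeout surfaces as a GenServer.call exit, as in the log above.

```elixir
defmodule LdapRecovery do
  @moduledoc """
  Hypothetical helper (not part of Paddle): runs a function and turns a
  GenServer.call timeout exit into {:error, :timeout}.
  """

  # `fun` is the call to protect,
  #   e.g. fn -> Paddle.authenticate(username, password) end
  # `on_timeout` runs once after a timeout; by default it restarts the
  # :paddle application, as in the workaround above.
  def with_recovery(fun, on_timeout \\ &restart_paddle/0) do
    fun.()
  catch
    # GenServer.call exits with {:timeout, {GenServer, :call, args}}
    # when no reply arrives within the timeout.
    :exit, {:timeout, _call_info} ->
      on_timeout.()
      {:error, :timeout}
  end

  # Stops and starts the :paddle OTP application.
  def restart_paddle do
    Application.stop(:paddle)
    Application.start(:paddle)
  end
end
```

With this in place, a controller could call LdapRecovery.with_recovery(fn -> Paddle.authenticate(username, password) end) and handle {:error, :timeout} like any other failure instead of crashing the request process. Note that stopping the application may itself take a few seconds if the Paddle process is still blocked.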

minijackson commented 6 years ago

I just came across something strange: the default value of the :timeout option in Erlang's eldap module (which this library is built on) is :infinity.

So it might be that the timeout is not coming from the client / server connection.

Could you give more info about your issue? (debug logs, etc.)
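One possible reading of the log, offered as an illustration rather than a diagnosis: the 5000 in GenServer.call(Paddle, ..., 5000) is GenServer.call's own default timeout, so the exit is raised by the caller when the Paddle process does not reply in time, regardless of any eldap timeout. The sketch below uses a made-up StuckServer whose handler never returns, standing in for a request that blocks forever; it does not use Paddle or eldap.

```elixir
defmodule StuckServer do
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, :ok)

  @impl true
  def init(:ok), do: {:ok, nil}

  # Simulates a request that never completes, e.g. a blocking read
  # with an :infinity timeout on a dead connection.
  @impl true
  def handle_call(:authenticate, _from, state) do
    Process.sleep(:infinity)
    {:reply, :ok, state}
  end
end

defmodule TimeoutDemo do
  # Returns {:ok, reply} on success, or {:exit, reason} if the call exits.
  def call(server, timeout_ms) do
    {:ok, GenServer.call(server, :authenticate, timeout_ms)}
  catch
    :exit, reason -> {:exit, reason}
  end
end
```

Calling TimeoutDemo.call(pid, 100) against a started StuckServer returns {:exit, {:timeout, {GenServer, :call, [pid, :authenticate, 100]}}}. The server process stays alive and blocked, so every later call queues behind the first one and times out as well, which would match a log where each request fails after exactly five seconds.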

sgeos commented 6 years ago

We enabled debug logging for our production build. I will post the logs later.

We changed the supervision strategy to :one_for_all but it did not solve the problem. If anything is dying, I suspect it is restarted by the paddle application without reconnecting. A die-on-timeout or reconnect-on-timeout setting will probably solve our problem. Alternatively, perhaps :die, :reconnect and :restart could be options for a timeout setting, where the current :restart strategy is the default.

sgeos commented 6 years ago

I'm not seeing anything in the log that is worth reporting. Is there a way to turn logging on for paddle that might be useful? For now we are restarting the whole containerized application every 30 minutes. It is a dirty fix, but it seems to be working.

minijackson commented 6 years ago

I have found a way to get the :eldap module to output some logs through Logger. I'm not sure if this will be very useful but we can still try.

The only way to crank the logging up for Paddle is to configure Logger to output the :debug level (which will now give us more info about what :eldap is doing).
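For reference, setting Logger to the :debug level is usually done in the application's config file. The snippet assumes a standard Mix project layout; on Elixir versions before 1.9, `use Mix.Config` takes the place of `import Config`.

```elixir
# config/config.exs (or the environment-specific file, e.g. config/prod.exs)
import Config

# Emit every log level, including :debug messages.
config :logger, level: :debug
```

The level can also be changed on a running node with Logger.configure(level: :debug), which avoids a rebuild when debugging a production system.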

I'll make a minor release shortly.

shamanime commented 6 years ago

This is also addressed in pull request #21.