samsondav / rihanna

Rihanna is a high performance postgres-backed job queue for Elixir
MIT License
439 stars 47 forks source link

Fix processes linkage #14

Closed pdawczak closed 6 years ago

pdawczak commented 6 years ago

_The diff of changes is a bit unfortunate, but the changes here are really simple - the test highlighting the problem has been added, andPostgrex.start_link/1 has been moved to init/1._

What

As per comment on the Postgrex initialisation, it's the Postgrex crashing should take instance of JobDispatcher down.

Problem

JobDispatcher.start_link/2 is only an "interface function", that is executed in contex of a caller, thus, in case of test case introduced - Poostgrex.start_link/1 will link the Postgrex session to the test.

This will result with wrong linking - JobDispatcher won't crash, as the message of Posgrex going down will be delivered to test processe's mailbox.

Fix

Moving initialisation to GenServers init/1 will result in executing the code in context of the process, effectively linking them both properly.

Additional considerations

  1. Please note, when running the test it was failing with error:

    21:59:25.705 [error] GenServer #PID<0.197.0> terminating
    ** (stop) exited in: :gen_server.call(#PID<0.196.0>, {:checkout, 
    #Reference<0.400084479.2846883841.4701>, true, 15000}, 5000)
      ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
      (db_connection) lib/db_connection/connection.ex:54: DBConnection.Connection.checkout/2
    [...]

    which was implicitly highlighting the problem - if I understand correctly, the code failed, because Postgrex didn't checkout the connection back properly to JobDispatcher. This proofs relation of proper exit of Postgrex (as it failed to checkout connection), and JobDispatcher remaining alive (as it raised exception of checkout timeout).

  2. I was unable to come up with better idea of nicer way of obtaining the Postgrexs pid to explicitly kill it in the test. I had to use :sys.get_state/1 to introspect internals of up-and-running instance of JobDispatcher. This might be considered fragile (as changing internals of JobDispatcher might break the test), but I couldn't think of better way of going around it. Maybe the test can be removed? It's proved it's value exposing problem, I can't think of a way of highlighting future regression.

samsondav commented 6 years ago

Nice! Thanks man 💙 💛 💚 ❤️ 💜