Clarifying the error behavior of the Erlang client

Hi,

The Erlang client needs some clarification on what you should do when an error condition occurs, at least in the form of documentation. The problem is that the split of responsibility between the client code and your application is not documented.

Typical error conditions:

One process opens a connection. A second process opens a channel on that connection and begins consuming from a queue. The second process dies and gets restarted. Now there is a nonexistent consumer on the channel that still gets messages forwarded to it, and those messages can never be acked. Workaround: process_flag(trap_exit, true) and handle closing down gracefully yourself.
If the channel process dies, you can end up in situations where the channel manager tries to forward messages to a nonexistent channel. Workaround: process_flag(trap_exit, true) as above.
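
To make the workaround concrete, here is a minimal sketch of a consumer that traps exits and cleans up after itself. The module name safe_consumer and its state record are made up for illustration. The explicit link(Ch) reflects my assumption that the client does not link the channel to the process that opened it, and whether the explicit cancel in terminate/2 is strictly required is exactly the kind of thing the documentation does not say.

```erlang
-module(safe_consumer).
-behaviour(gen_server).

-include_lib("amqp_client/include/amqp_client.hrl").

-export([start_link/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

%% Hypothetical state: the channel pid and the consumer tag we registered.
-record(state, {channel, tag}).

%% Conn is an already-open connection pid owned by some other process.
start_link(Conn, Queue) ->
    gen_server:start_link(?MODULE, {Conn, Queue}, []).

init({Conn, Queue}) ->
    %% Trap exits so that terminate/2 runs on supervisor shutdown and so
    %% that a dying channel shows up as an {'EXIT', ...} message.
    process_flag(trap_exit, true),
    {ok, Ch} = amqp_connection:open_channel(Conn),
    %% Assumption: the channel is not linked to us by the client.
    link(Ch),
    #'basic.consume_ok'{consumer_tag = Tag} =
        amqp_channel:subscribe(Ch, #'basic.consume'{queue = Queue}, self()),
    {ok, #state{channel = Ch, tag = Tag}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% The subscriber also receives the consume confirmation as a message.
handle_info(#'basic.consume_ok'{}, State) ->
    {noreply, State};
handle_info({#'basic.deliver'{delivery_tag = DTag}, #amqp_msg{}},
            #state{channel = Ch} = State) ->
    %% Real work would go here; this sketch only acks the delivery.
    ok = amqp_channel:cast(Ch, #'basic.ack'{delivery_tag = DTag}),
    {noreply, State};
handle_info({'EXIT', Ch, Reason}, #state{channel = Ch} = State) ->
    %% The channel is gone: stop and let the supervisor restart us.
    {stop, {channel_died, Reason}, State#state{channel = undefined}};
handle_info(_Other, State) ->
    {noreply, State}.

terminate(_Reason, #state{channel = undefined}) ->
    ok;
terminate(_Reason, #state{channel = Ch, tag = Tag}) ->
    %% Best-effort cleanup: cancel the consumer, then close the channel.
    catch amqp_channel:call(Ch, #'basic.cancel'{consumer_tag = Tag}),
    catch amqp_channel:close(Ch),
    ok.
```

Note that terminate/2 is not called when the process is killed brutally, so this only covers orderly shutdown and crashes of the channel itself.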

What I would like is some kind of documentation which describes the error behavior of the client. If an error occurs in my code, what is my responsibility w.r.t. cleanup and reaching some well-known invariant state? If an error occurs in the RabbitMQ client code, how can I monitor this situation and close down the appropriate parts of my supervisor tree? Just documenting the status quo would go a long way toward avoiding problems for people. And I have a feeling it will also help the internals of RabbitMQ in the long run.

The vision, I think, is to make the client behave like typical Erlang resources. Ports, ETS tables, and so on have an owning/controlling process, and their life-cycle is defined in terms of that process. When the owner terminates, the resource is either torn down or handed over to an heir. So if a process has opened a channel, Ch, and that process goes away, then so should the channel. This would simplify a lot of error handling code in the applications using RabbitMQ, as they would not have to trap exits and then handle the close-down themselves. Rather, you could simply tear down the supervisor tree and have it clean up.
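
As a rough illustration of the ownership idea, the sketch below ties an arbitrary cleanup action to the lifetime of an owning process by monitoring it. The module owner_guard is made up for this example and is not part of the client; it only shows the behavior I would like the client to provide by itself.

```erlang
-module(owner_guard).
-export([start/2]).

%% Spawn a guard that runs Cleanup() once Owner terminates, for any reason.
%% The guard is deliberately not linked to Owner, so it survives the crash
%% of the process it is watching.
start(Owner, Cleanup) when is_pid(Owner), is_function(Cleanup, 0) ->
    spawn(fun() ->
        Ref = erlang:monitor(process, Owner),
        receive
            {'DOWN', Ref, process, Owner, _Reason} ->
                %% Best effort: the resource may already be gone.
                catch Cleanup(),
                ok
        end
    end).
```

A process that has just opened a channel Ch could then call owner_guard:start(self(), fun() -> amqp_channel:close(Ch) end), and the channel would be closed when that process goes away, without the process having to trap exits.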

I can cook up some concrete examples for the above cases if needed, but I think documentation would go a long way toward making applications more robust. If I know what I have to handle myself, I can address the problem. But error paths are absent from the documentation, so I can't handle these cases and be sure I have everything nailed down.

rabbitmq/rabbitmq-server, issue #2647