HCA-Healthcare / elixir-mllp

An Elixir library for transporting HL7 messages via MLLP (Minimal Lower Layer Protocol)
Apache License 2.0
41 stars 19 forks source link

Smarter reconnect behaviour for MLLP.Sender #14

Closed starbelly closed 2 years ago

starbelly commented 3 years ago

This PR intends to add "smarter" reconnect behavior for MLLP.Sender. Specifically, as it stands today whenever there is an error on send or recv we simply attempt to reconnect. One problem as a result of the current behavior is multiple connections will be established (i.e., the current socket is still open, but we end up opening another one in the case of a server that multiplexes).

This PR also aims to provide better error messages to the caller as well.

starbelly commented 3 years ago

Note: Currently tests fail and I have added no new tests yet. I figured we may want to go over the changes themselves a bit first.

vikas15bhardwaj commented 2 years ago

Few observation as I did some digging into how gen_tcp behaves related to connections:

starbelly commented 2 years ago

Few observation as I did some digging into how gen_tcp behaves related to connections:

* Every time we call `gen_tcp.connect` we get a new socket (which is of type `port`) if connection is successful.

* Socket is basically a Port and port always remain open unless we call `gen_tcp.close` even if other side has closed the connection.

* On every reconnect attempt, we are basically leaving old port open currently (if any was opened previously successfully)

* On a port, we can check Port.info to find if port is still open or not.

Yup, but I think I'm going to do an about face on checking the socket with Process.info, I think that may be too defensive, what I have now and will push up in a few works. Basically, if we end up in that handle_info/2 callback it should be because it's correct that we end up there, thus avoiding the defensive check.

I do agree, if we have an existing socket still, we should close it first (or attempt to), but let me fiddle a bit.

starbelly commented 2 years ago

@vikas15bhardwaj I have put this in a ready to review status. I think there's quite a few other changes we should make to improve reconnect error, but this solves regressions observed in integration testing.

I think other tickets should be opened to :

  1. Make auto-reconnect optional

  2. Add backoff for reconnect (reconnecting over and over without backoff can be harmful)

Those two things at the very least. But as stated, what's in this PR resolves a problem and is probably best limited to that.

vikas15bhardwaj commented 2 years ago

In mix test, we are getting an error logged. Can you check this and may be do capture_log to avoid this showing up in console if required ..... 09:36:22.723 [error] The DefaultPacketFramer is discarding unexpected data: ..... ..........................................................................