heartbeat thread commented in code

gst commented 9 years ago

Hi,

I have another question ;)

why is commented the code to initiate/start the heartbeat thread, it's in https://github.com/veegee/amqpy/blob/master/amqpy/connection.py#L171

we have uncommented that on our side and apparently it's working correctly..

regards,

veegee commented 9 years ago

In the past, there were some issues with synchronization and lock contention while that thread was running, so I commented it out and added a single-threaded automatic heartbeat mechanism. Connection.drain_events() automatically sends heartbeats appropriately while blocking, so a separate thread is no longer necessary.

In addition, I just uploaded another build to PyPI with a critical fix related to heartbeat timeouts. Continuing to fix further bugs today.

gst commented 9 years ago

Thanks à lot.

gst commented 9 years ago

Connection.drain_events() automatically sends heartbeats appropriately while blocking

ah but this means we have to recurrently call drain_events() explicitely ourself and so we can't just keep a reference to an open connection and have it "automagically" handle the heartbeat ??

gst commented 9 years ago

calling is_alive() on the connection also send an heartbeat so it could be enough to just poll this function regularly .. (at least at the interval declared for the heartbeat timeout) ?

Can you confirm it is safe to call this is_alive() from one (dedicated) thread while doing others things (publish/drain_events) with the same connection from other(s) thread(s) ? (given the _frame_write_lock).

While I'm at it : is it also well safe to have multiple threads use the same connection concurrently (to publish on the same exchange using the same channel) .. ??

thanks again.

gst commented 9 years ago

In the past, there were some issues with synchronization and lock contention while that thread was running

are you sure about that ? unless the choosen heartbeat timeout would be very very low (< 0.1 secs, or even lower but that's not sane heartbeat timeout ; usually few seconds (say 5-10) is enough for most of, for instance, loadbalancers default configuration value for their "keep-session" timeouts) I don't see why there would be such lock contention after all..

because as I say : We have some amqpy connection, that could stay long before be used at all, that's why we had uncommented the thread beat code on our side and use it (successfully as far as we see).

because otherwise, we would have to do that (call drain_events(), or send_heartbeat()) ourself explicitely with a dedicated thread (or some callback in an event loop) ; which is in fact the same as having that heartbeat thread handled at the connection level..

veegee commented 9 years ago

You can just call send_heartbeat() to send a heartbeat directly.
You can start a separate thread manually to send it at regular intervals.
The library is thread safe, but the protocol and behaviour is designed to operate with one connection per thread, or one channel per thread with a single global connection. Do not use multiple threads on the same channel, it ~~may not work how you expect~~ will cause corruption.

veegee commented 9 years ago

The contention was likely caused by another bug that I wanted to isolate. It may be perfectly safe to use the heartbeat thread again. In production, I'm just using a heartbeat interval of 60s and a single thread and never had issues as long as nothing blocks for more than the heartbeat interval.

gst commented 9 years ago

ok, thanks for complementary info :)

we could re-enable the feature here so you agree ?

gst commented 9 years ago

https://github.com/veegee/amqpy/pull/17

veegee / amqpy

heartbeat thread commented in code #12