nats-io / stan.go

NATS Streaming System
https://nats.io
Apache License 2.0
706 stars 117 forks source link

[FIXED] Reduce use of new timer for publish calls #297

Closed kozlovic closed 4 years ago

kozlovic commented 4 years ago

Use a long-lived go routine that deals with publish timeouts. After the low level publish, we used to create a new timer for the timeout of the publish call. This is replaced by adding the "ack" to the list of publish ack waiting to be timed-out. If an ack is processed from the server, the ack is removed from the map and the list.

There was also an issue that on connection close we would "fail" pending publish acks, but invoke the callback from the connection close, which would have caused a deadlock if user invoke anything in the ack handler that requires the connection lock. Solve that by letting the publish ack timeout routine do that job. We could argue that on connection close we may not even want to do anything with those pending acks.

Also fixed a bunch of tests.

Resolves #294

Signed-off-by: Ivan Kozlovic ivan@synadia.com

kozlovic commented 4 years ago

What caused the desire to fix?

The fact that you assigned #294 to me :-)

It is true that creating a timer per publish ack is a lot, but it is also true that this would be limited to the MaxPubAcksInflight, so it is not really how "fast" we think a publisher can go, say 100,000 msgs/sec, it would still be likely 1024 "inflight" timers (1024 being the default MaxPubAcksInflight). So arguably we could do without this change. One thing that I would still have to keep (aside from the fixes to the tests) is the fact that we can't be invoking the pub ack's handlers from the connection close, so that part would need to be fixed even if the timer fix is rolled back.

derekcollison commented 4 years ago

ok gotcha.

kozlovic commented 4 years ago

Closing for now.. We think that the number of timers should not be that high because their number will be limited by MaxPubAcksInflight. Also will re-evaluate with Go 1.14.