nsqio / go-nsq

The official Go package for NSQ
MIT License

Client doesn't know it's disconnected #344

Open watchforstock opened 2 years ago

watchforstock commented 2 years ago

I have a system where I'm getting log messages from nsqd such as:

INFO: PROTOCOL(V2): [127.0.0.1:53376] exiting ioloop
ERROR: client(127.0.0.1:53376) - failed to read command - read tcp 127.0.0.1:4150->127.0.0.1:53376: i/o timeout
INFO: PROTOCOL(V2): [127.0.0.1:53376] exiting messagePump

From the /stats endpoint I can see that, after this message, the consumer is no longer connected (two producers are still connected fine). There were also 2 messages in timeout, which I presume is related to the consumer being disconnected.

However, there is no indication that the consumer knows it has lost its connection, so it makes no attempt to reconnect (or even to notify me that it has stopped). Nothing is signalled on StopChan, and Stats() on the consumer still reports a connection count of 1.

Is this the correct behaviour? Have I missed something around how to detect this i/o timeout situation? Unfortunately I can't share the source code, but happy to try things and report back.

Thanks

Andre

mreiferson commented 2 years ago

Doesn't sound like correct behavior. Can you share the logs from the consumer side? I assume you filed the issue here because the consumer uses go-nsq?

What versions are you running of nsqd and go-nsq?

JillChen099 commented 2 years ago

Perhaps having the consumer connect through nsqlookupd, rather than directly to nsqd, will solve the problem.