zeromq / pyzmq

PyZMQ: Python bindings for zeromq
http://zguide.zeromq.org/py:all
BSD 3-Clause "New" or "Revised" License

BUG: Too many resets on loopback after socket.connect() call #1963

Closed by aziz142010 4 months ago

aziz142010 commented 4 months ago

This is a pyzmq bug

What pyzmq version?

25.1.2

What libzmq version?

4.3.4

Python version (and how it was installed)

Python3.9

OS

Debian 9/10

What happened?

I have working code that uses pyzmq pub/sub sockets. However, in tcpdump I see too many RST packets. Is this how the connect call is supposed to operate, or is there something wrong with the way the code is written? I have tried both sync and async versions of the code.

Link to the packet capture: https://www.dropbox.com/scl/fi/s3zs0abj2lo6qcbsj48df/connect_dump.pcap?rlkey=9cck3z5bmmigy9vktrby5pcfo&dl=0

Code to reproduce bug

import asyncio
import zmq
#import zmq.asyncio
import pdb
import time

#async def test(sub_port: str):
def test(sub_port: str):
    #context: zmq.asyncio.Context = zmq.asyncio.Context.instance()
    context: zmq.Context = zmq.Context.instance()
    #sub_socket: zmq.asyncio.Socket = context.socket(zmq.SUB)
    sub_socket: zmq.Socket = context.socket(zmq.SUB)
    #sock_poller = zmq.asyncio.Poller()
    sock_poller = zmq.Poller()
    sock_poller.register(sub_socket, zmq.POLLIN)
    pdb.set_trace()
    sub_socket.connect("tcp://127.0.0.1:" + sub_port)
    time.sleep(5)

def main(sub_port: str):
    #asyncio.run(test(sub_port))
    test(sub_port)

main("5800")

Traceback, if applicable

CLIM(%DEFAULT):/home/User# tcpdump -i lo -s 0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
09:56:15.668568 IP localhost.60556 > localhost.5800: Flags [S], seq 1654471656, win 44000, options [mss 65495,sackOK,TS val 191445807 ecr 0,nop,wscale 5], length 0
09:56:15.668576 IP localhost.5800 > localhost.60556: Flags [R.], seq 0, ack 1654471657, win 0, length 0
09:56:15.851822 IP localhost.60560 > localhost.5800: Flags [S], seq 2842959151, win 44000, options [mss 65495,sackOK,TS val 191445990 ecr 0,nop,wscale 5], length 0
09:56:15.851827 IP localhost.5800 > localhost.60560: Flags [R.], seq 0, ack 2842959152, win 0, length 0
09:56:15.976993 IP localhost.60564 > localhost.5800: Flags [S], seq 3871357153, win 44000, options [mss 65495,sackOK,TS val 191446115 ecr 0,nop,wscale 5], length 0
09:56:15.976998 IP localhost.5800 > localhost.60564: Flags [R.], seq 0, ack 3871357154, win 0, length 0
09:56:16.118178 IP localhost.60576 > localhost.5800: Flags [S], seq 2045033265, win 44000, options [mss 65495,sackOK,TS val 191446256 ecr 0,nop,wscale 5], length 0
09:56:16.118183 IP localhost.5800 > localhost.60576: Flags [R.], seq 0, ack 2045033266, win 0, length 0
09:56:16.304412 IP localhost.60590 > localhost.5800: Flags [S], seq 4030675684, win 44000, options [mss 65495,sackOK,TS val 191446442 ecr 0,nop,wscale 5], length 0
09:56:16.304417 IP localhost.5800 > localhost.60590: Flags [R.], seq 0, ack 4030675685, win 0, length 0
09:56:16.501659 IP localhost.60598 > localhost.5800: Flags [S], seq 222390945, win 44000, options [mss 65495,sackOK,TS val 191446639 ecr 0,nop,wscale 5], length 0
09:56:16.501664 IP localhost.5800 > localhost.60598: Flags [R.], seq 0, ack 222390946, win 0, length 0
09:56:16.611817 IP localhost.60600 > localhost.5800: Flags [S], seq 2533795157, win 44000, options [mss 65495,sackOK,TS val 191446750 ecr 0,nop,wscale 5], length 0
09:56:16.611821 IP localhost.5800 > localhost.60600: Flags [R.], seq 0, ack 2533795158, win 0, length 0
09:56:16.750000 IP localhost.60606 > localhost.5800: Flags [S], seq 3494071277, win 44000, options [mss 65495,sackOK,TS val 191446888 ecr 0,nop,wscale 5], length 0
09:56:16.750004 IP localhost.5800 > localhost.60606: Flags [R.], seq 0, ack 3494071278, win 0, length 0
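
A note for readers interpreting the trace above: every [S] is answered within microseconds by [R.], which is typically what the kernel sends when nothing is listening on the destination port. The spacing between attempts can be checked with a short sketch. The timestamps are copied from the SYN lines of the trace; the helper name `gaps_ms` is hypothetical and only for illustration.

```python
from datetime import datetime

# Timestamps of the eight SYN packets, copied from the tcpdump output above.
SYN_TIMES = [
    "09:56:15.668568",
    "09:56:15.851822",
    "09:56:15.976993",
    "09:56:16.118178",
    "09:56:16.304412",
    "09:56:16.501659",
    "09:56:16.611817",
    "09:56:16.750000",
]


def gaps_ms(stamps):
    """Return the gap in milliseconds between consecutive timestamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [
        round((later - earlier).total_seconds() * 1000, 1)
        for earlier, later in zip(times, times[1:])
    ]


if __name__ == "__main__":
    for gap in gaps_ms(SYN_TIMES):
        print(f"{gap} ms")
```

Every gap falls between 100 and 200 ms. That would be consistent with libzmq's documented default ZMQ_RECONNECT_IVL of 100 ms plus some randomization, but this reading is an interpretation and is not confirmed anywhere in the thread.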

More info

No response

minrk commented 4 months ago

I think this is a question for libzmq; pyzmq doesn't affect anything about the TCP behavior of libzmq sockets. I don't see anything wrong with your code at first glance. However, a single libzmq connect may result in repeated connection attempts at the transport level if the peer disconnects, isn't available yet when connect starts, or anything like that.
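
To illustrate the point about repeated transport-level attempts: `connect()` on a libzmq socket returns immediately, and libzmq retries in the background on a schedule controlled by the ZMQ_RECONNECT_IVL and ZMQ_RECONNECT_IVL_MAX socket options. The sketch below is a minimal example under stated assumptions, not a recommended fix from the thread: the helper name `make_sub_socket` and the interval values are hypothetical, and port 5800 is taken from the original report.

```python
import zmq


def make_sub_socket(endpoint: str,
                    reconnect_ms: int = 1000,
                    reconnect_max_ms: int = 30000) -> zmq.Socket:
    """Create a SUB socket that retries failed connections less aggressively.

    The interval values are illustrative assumptions, not recommendations.
    """
    ctx = zmq.Context.instance()
    sub = ctx.socket(zmq.SUB)
    # Do not block on close if the peer never became available.
    sub.setsockopt(zmq.LINGER, 0)
    # Initial delay between reconnection attempts, in milliseconds.
    sub.setsockopt(zmq.RECONNECT_IVL, reconnect_ms)
    # Upper bound for the backoff; 0 (the default) disables backoff.
    sub.setsockopt(zmq.RECONNECT_IVL_MAX, reconnect_max_ms)
    # Subscribe to all topics.
    sub.setsockopt(zmq.SUBSCRIBE, b"")
    # Returns immediately, even if nothing is listening yet;
    # libzmq keeps retrying in its I/O thread.
    sub.connect(endpoint)
    return sub


if __name__ == "__main__":
    sock = make_sub_socket("tcp://127.0.0.1:5800")
    print("reconnect interval (ms):", sock.getsockopt(zmq.RECONNECT_IVL))
    sock.close()
```

Raising the interval only reduces how often a SYN is sent while no publisher is bound; it does not change whether the attempts are refused.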

aziz142010 commented 4 months ago

Thank you for looking into it. I see these disconnects even after the connection is established with the peer, which is what bothers me.
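
One way to check whether the socket really disconnects after a connection is established is to attach a monitor socket and log the connection events libzmq reports. The following is a minimal sketch, assuming pyzmq's `get_monitor_socket()` and `zmq.utils.monitor.recv_monitor_message()`; the helper name `watch_connection` and the `EVENT_NAMES` table are hypothetical, and the thread does not say whether the reporter tried this.

```python
import time

import zmq
from zmq.utils.monitor import recv_monitor_message

# Subset of libzmq socket events relevant to connect/disconnect behavior.
EVENT_NAMES = {
    int(zmq.EVENT_CONNECTED): "CONNECTED",
    int(zmq.EVENT_CONNECT_DELAYED): "CONNECT_DELAYED",
    int(zmq.EVENT_CONNECT_RETRIED): "CONNECT_RETRIED",
    int(zmq.EVENT_CLOSED): "CLOSED",
    int(zmq.EVENT_DISCONNECTED): "DISCONNECTED",
}


def watch_connection(endpoint: str, seconds: float = 1.0) -> list:
    """Connect a SUB socket and collect monitor event names for a while."""
    ctx = zmq.Context.instance()
    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.LINGER, 0)
    # Create the monitor before connect() so early events are not missed.
    monitor = sub.get_monitor_socket()
    sub.connect(endpoint)

    seen = []
    deadline = time.monotonic() + seconds
    try:
        while True:
            remaining_ms = int((deadline - time.monotonic()) * 1000)
            if remaining_ms <= 0:
                break
            # poll() returns a non-zero mask when an event message is waiting.
            if monitor.poll(remaining_ms):
                evt = recv_monitor_message(monitor)
                code = int(evt["event"])
                seen.append(EVENT_NAMES.get(code, str(code)))
    finally:
        sub.disable_monitor()
        monitor.close()
        sub.close()
    return seen


if __name__ == "__main__":
    for name in watch_connection("tcp://127.0.0.1:5800", seconds=2.0):
        print(name)
```

If a publisher is bound and the link is stable, one would expect a CONNECTED event with no later DISCONNECTED events; a run of CONNECT_RETRIED events instead suggests the attempts are being refused, as in the capture.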