zeromq / libzmq

ZeroMQ core engine in C++, implements ZMTP/3.1
https://www.zeromq.org
Mozilla Public License 2.0

Too many resets on loopback after socket.connect() call #4666

Closed aziz142010 closed 6 months ago

aziz142010 commented 6 months ago

Issue description

I have working code that uses pyzmq pub/sub sockets. However, in tcpdump I see a large number of RST packets. Is this how the connect call is supposed to behave, or is something wrong with the way the code is written? I have tried both the sync and async versions of the code.

Link to the packet capture: https://www.dropbox.com/scl/fi/s3zs0abj2lo6qcbsj48df/connect_dump.pcap?rlkey=9cck3z5bmmigy9vktrby5pcfo&dl=0

Environment

Minimal test code / Steps to reproduce the issue

import asyncio
import zmq
#import zmq.asyncio
import pdb
import time

#async def test(sub_port: str):
def test(sub_port: str):
    #context: zmq.asyncio.Context = zmq.asyncio.Context.instance()
    context: zmq.Context = zmq.Context.instance()
    #sub_socket: zmq.asyncio.Socket = context.socket(zmq.SUB)
    sub_socket: zmq.Socket = context.socket(zmq.SUB)
    #sock_poller = zmq.asyncio.Poller()
    sock_poller = zmq.Poller()
    sock_poller.register(sub_socket, zmq.POLLIN)
    pdb.set_trace()
    sub_socket.connect("tcp://127.0.0.1:" + sub_port)
    time.sleep(5)

def main(sub_port: str):
    #asyncio.run(test(sub_port))
    test(sub_port)

main("5800")

What's the actual result? (include assertion message & call stack if applicable)

CLIM(%DEFAULT):/home/User# tcpdump -i lo -s 0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
09:56:15.668568 IP localhost.60556 > localhost.5800: Flags [S], seq 1654471656, win 44000, options [mss 65495,sackOK,TS val 191445807 ecr 0,nop,wscale 5], length 0
09:56:15.668576 IP localhost.5800 > localhost.60556: Flags [R.], seq 0, ack 1654471657, win 0, length 0
09:56:15.851822 IP localhost.60560 > localhost.5800: Flags [S], seq 2842959151, win 44000, options [mss 65495,sackOK,TS val 191445990 ecr 0,nop,wscale 5], length 0
09:56:15.851827 IP localhost.5800 > localhost.60560: Flags [R.], seq 0, ack 2842959152, win 0, length 0
09:56:15.976993 IP localhost.60564 > localhost.5800: Flags [S], seq 3871357153, win 44000, options [mss 65495,sackOK,TS val 191446115 ecr 0,nop,wscale 5], length 0
09:56:15.976998 IP localhost.5800 > localhost.60564: Flags [R.], seq 0, ack 3871357154, win 0, length 0
09:56:16.118178 IP localhost.60576 > localhost.5800: Flags [S], seq 2045033265, win 44000, options [mss 65495,sackOK,TS val 191446256 ecr 0,nop,wscale 5], length 0
09:56:16.118183 IP localhost.5800 > localhost.60576: Flags [R.], seq 0, ack 2045033266, win 0, length 0
09:56:16.304412 IP localhost.60590 > localhost.5800: Flags [S], seq 4030675684, win 44000, options [mss 65495,sackOK,TS val 191446442 ecr 0,nop,wscale 5], length 0
09:56:16.304417 IP localhost.5800 > localhost.60590: Flags [R.], seq 0, ack 4030675685, win 0, length 0
09:56:16.501659 IP localhost.60598 > localhost.5800: Flags [S], seq 222390945, win 44000, options [mss 65495,sackOK,TS val 191446639 ecr 0,nop,wscale 5], length 0
09:56:16.501664 IP localhost.5800 > localhost.60598: Flags [R.], seq 0, ack 222390946, win 0, length 0
09:56:16.611817 IP localhost.60600 > localhost.5800: Flags [S], seq 2533795157, win 44000, options [mss 65495,sackOK,TS val 191446750 ecr 0,nop,wscale 5], length 0
09:56:16.611821 IP localhost.5800 > localhost.60600: Flags [R.], seq 0, ack 2533795158, win 0, length 0
09:56:16.750000 IP localhost.60606 > localhost.5800: Flags [S], seq 3494071277, win 44000, options [mss 65495,sackOK,TS val 191446888 ecr 0,nop,wscale 5], length 0
09:56:16.750004 IP localhost.5800 > localhost.60606: Flags [R.], seq 0, ack 3494071278, win 0, length 0

What's the expected result?

I don't expect to see this many TCP RST packets.

aziz142010 commented 6 months ago

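For anyone who comes looking for a similar problem: this is the result of calling connect() on a socket after the listener on the other end has already closed. Nothing is accepting on the port, so every SYN is answered with an RST, and ZeroMQ keeps retrying the connection in the background.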
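
The SYN / RST pairs in the capture can be reproduced without ZeroMQ at all. The following is a minimal sketch using only the Python standard library, not code from the reporter or from libzmq; the helper names `free_loopback_port` and `try_connect` are made up for illustration. It assumes a Linux-like loopback where a connect to a port with no listener is refused immediately.

```python
import socket
from typing import List


def free_loopback_port() -> int:
    """Let the OS pick a free port, then close the socket so nothing listens on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def try_connect(port: int, attempts: int = 3) -> List[str]:
    """Attempt several TCP connects to 127.0.0.1:port and record each outcome."""
    results = []
    for _ in range(attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(1.0)
            try:
                s.connect(("127.0.0.1", port))
                results.append("connected")
            except ConnectionRefusedError:
                # The kernel answered our SYN with an RST: no listener on this port.
                results.append("refused")
    return results


if __name__ == "__main__":
    port = free_loopback_port()
    print(try_connect(port))
```

Each refused attempt corresponds to one SYN followed by one RST in tcpdump. A ZeroMQ socket does the same thing on its own after connect(), because it reconnects automatically until a peer appears. If the volume of retries is a concern, pyzmq exposes the `zmq.RECONNECT_IVL` and `zmq.RECONNECT_IVL_MAX` socket options (in milliseconds) to space the attempts out; whether that is appropriate depends on how quickly the subscriber needs to pick up a publisher that starts late.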