Open gfokkema opened 2 years ago
Thank you for letting me know about this mysterious symptom.
At first glance I can't tell what's going on. I haven't looked into it yet.
By the way, what does nc -N mean? My nc does not support the -N option, so I don't understand.
Hi there!
nc -N shuts down the network socket after EOF on stdin (BSD netcat vs. GNU netcat, or something like that).
After the echo example has missed a reply, as soon as netcat closes the fd, the EOF and subsequent socket close cause a second message that does trigger the reply. This makes the echo example write to the already closed fd, leading to a panic (as was noted in the example too).
The panic is thus really to be expected. The curious behaviour is why the first trigger to process the packet is sometimes missed until a second trigger comes along (either EOF or a different message, such as the second message a in the runs below).
That is, this intermittent behaviour:
gerlof@host:redbpf $ (sleep .0015 ; echo test; sleep 1; echo a) | nc 127.0.0.1 10000
test # <-- received immediately
a # <-- received after 1 second
gerlof@host:redbpf $ (sleep .0015 ; echo test; sleep 1; echo a) | nc 127.0.0.1 10000
test # <-- received after 1 second
a # <-- received after 1 second
gerlof@host:redbpf $ (sleep .0015 ; echo test; sleep 1; echo a) | nc 127.0.0.1 10000
test # <-- received immediately
a # <-- received after 1 second
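To make the nc -N explanation above concrete, here is a minimal sketch of a half-close on a plain loopback TCP connection. It does not involve redbpf or a sockmap; the function name `half_close_demo` and the payload are made up for illustration. It only shows that shutting down the write side after sending delivers an EOF to the server as a separate event, which is the "second trigger" discussed above.

```python
import socket
import threading


def half_close_demo():
    """Return (payload, saw_eof) as observed by the server side."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def client():
        with socket.create_connection(("127.0.0.1", port), timeout=5) as c:
            c.sendall(b"test\n")
            # Roughly what nc -N does once stdin hits EOF: close only the
            # sending direction, keep the receiving direction open.
            c.shutdown(socket.SHUT_WR)
            c.recv(1)  # returns b"" once the server closes its side

    t = threading.Thread(target=client)
    t.start()

    conn, _ = listener.accept()
    conn.settimeout(5)
    chunks = []
    saw_eof = False
    while True:
        data = conn.recv(4096)
        if not data:  # empty read: the peer shut down its write side
            saw_eof = True
            break
        chunks.append(data)

    conn.close()
    listener.close()
    t.join()
    return b"".join(chunks), saw_eof


if __name__ == "__main__":
    print(half_close_demo())
```

With GNU netcat (no -N), the write side is not shut down on stdin EOF, so the server would not see this extra event at that point.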
Thank you for your response!
@gfokkema
I've investigated this issue recently and fixed part of the problem. But I am still looking for a solution to the stream parser not being triggered until the second packet is received.
Solved issues
Root cause of above problems
Unsolved problem
It seems that data that has already been received and stored in the TCP backlog before the sockmap is set does not trigger the stream parser. As a result, a first packet sent before the sockmap is set does not lead to an echo response.
I am wondering whether there is a way to trigger the stream parser on the data already in the backlog when updating the sockmap.
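The race described here can be illustrated without any BPF at all. The following is a rough sketch, assuming a plain loopback TCP connection; the function name `queued_before_accept` is hypothetical, and the sockmap update is only marked by a comment. It shows that data sent by an eager client is already sitting in the socket's receive queue by the time the server obtains the fd, i.e. before the server could possibly have added it to a sockmap.

```python
import select
import socket


def queued_before_accept():
    """Return the bytes already pending on the socket right after accept()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    # The kernel completes the handshake for us, so the client can connect
    # and send before the server has even called accept().
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(b"test\n")

    conn, _ = listener.accept()
    # <-- This is where a server would insert conn into the sockmap
    #     (not done here; assumption for illustration only).

    # Wait until the socket is readable, then peek without consuming.
    readable, _, _ = select.select([conn], [], [], 5.0)
    pending = conn.recv(4096, socket.MSG_PEEK) if readable else b""

    client.close()
    conn.close()
    listener.close()
    return pending


if __name__ == "__main__":
    print(queued_before_accept())
```

A client that sends immediately after connecting, with no sleep in between, would hit this window essentially every time.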
I forgot to attach the link of the PR that solves problems mentioned above. Here it is:
I'll let you know if I find a way to apply the stream parser to a first packet that is received before the sockmap is set.
@gfokkema
I found a workaround to trigger the stream parser manually.
Calling setsockopt(SO_RCVLOWAT) right after setting the sockmap solves the problem. It makes the kernel check whether received TCP data is ready, so that packets already received before the sockmap was set are processed immediately.
This PR makes the first packet echo instantly. I hope this change solves the problem you had struggled with. Thanks,
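For readers unfamiliar with the call, here is a minimal sketch of setting SO_RCVLOWAT on an accepted TCP socket. This is not the code from the PR: the helper names `poke_rcvlowat` and `demo` are hypothetical, the value 1 is an assumption (the thread does not say which value is used), the sockmap update itself is only marked by a comment, and the example assumes Linux.

```python
import socket


def poke_rcvlowat(conn, low_water_mark=1):
    """Set SO_RCVLOWAT on conn and return the value the kernel reports back.

    low_water_mark=1 is an assumed value chosen for this sketch.
    """
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_RCVLOWAT, low_water_mark)
    return conn.getsockopt(socket.SOL_SOCKET, socket.SO_RCVLOWAT)


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(b"test\n")  # arrives before any sockmap could be set

    conn, _ = listener.accept()
    # <-- The sockmap update would happen here (omitted; needs a BPF map).
    value = poke_rcvlowat(conn)  # the workaround: call right after the update

    client.close()
    conn.close()
    listener.close()
    return value


if __name__ == "__main__":
    print(demo())
```

On its own this only demonstrates the socket option call. Whether it re-triggers the stream parser depends on the sockmap being attached, as described in the comment above.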
Thank you very much for your efforts, very much appreciated!
My apologies for the late reply (student life), I'll make sure to soon find some time, test your improvements and post an update here.
To reproduce the issue with a contrived example:
The output of echo for a successful result, followed by a failed result:
Keeping the connection open and sending a second message when the first reply was not sent causes the first and second replies to both be sent simultaneously:
Using bpftool to trace the program output, I can verify that #[sendmap_parser] is not called for the first packet in the problematic cases:
and otherwise:
Likewise, when keeping the connection open and receiving a delayed first reply together with the second reply, there is only one trace entry:
and otherwise, when receiving an instant reply, followed by the second reply, there are two trace entries:
Unfortunately I'm only taking a first look at redbpf, so I'm in no way qualified to tell what's going on here. Reducing the sleep or removing it altogether behaves as expected and makes echo fail all the time.
A less contrived example with real-world use cases, which indeed removes that sleep, is curl ;)