Closed maribu closed 5 years ago
@miri64: Care to take a look?
Will do, but only after this week at the earliest. If it is pressing, feel free to fix it earlier.
There is code to prevent this happening but it relies on id
being unique per each execution of ping6
.
It's generating id
with luid_custom()
which does not return unique values:
https://github.com/RIOT-OS/RIOT/blob/6e9612f1d1e41ef56b79c1fcddc3d7e3aae2b9d7/sys/shell/commands/sc_gnrc_icmpv6_echo.c#L244
I replaced luid_custom()
with luid_get()
and it appears to have resolved the problem. Of course this still doesn't handle the case where MODULE_LUID
is not available.
Mh... I don't really get the problem. If there are duplicates ping
(even the iputils
implementation) shows them. If you spoof echo replies they are shown as well, but there is no way to distinguish spoofs from real if you use the correct addresses.
- On A:
ping6
node B and immediately afterwardsping6
some unreachable/unassigned IPv6 address
What does the latter accomplish? I should not get any responses, as there is no node to send them.
- On A: ping6 node B and immediately afterwards ping6 some unreachable/unassigned IPv6 address
What does the latter accomplish? I should not get any responses, as there is no node to send them.
If B sends duplicates for some extended time, and A stops pinging B but starts pinging unavailable node C, A will receive the delayed responses of B. But as the sender address is not matching C, A should ignore the replies (or give a debug message). ping6
however takes the delayed responses into account and will stop pinging when 3 responses (regardless of sender address) are receiced. If it only send 2 requests in that time, it will also print 133% success ratio...
If B sends duplicates for some extended time, and A stops pinging B but starts pinging unavailable node C, A will receive the delayed responses of B. But as the sender address is not matching C, A should ignore the replies (or give a debug message).
ping6
however takes the delayed responses into account and will stop pinging when 3 responses (regardless of sender address) are receiced. If it only send 2 requests in that time, it will also print 133% success ratio...
Ahhh, so the sender address not matching is the problem?
@maribu #11933 should fix the issue.
@maribu #11933 should fix the issue.
It should fix 2 but it wouldn't fix 1:
1) On A: ping6 node B more than once 1) On A: ping6 node B and immediately afterwards ping6 some unreachable/unassigned IPv6 address
2. On A: ping6 node B and immediately afterwards ping6 some unreachable/unassigned IPv6 address
Ok, but then I don't understand what is happening. If I ping an unreachable address, there should be no reply. As we now with #11933 check the source of the packet against the pinged address the second case should also be fixed. Setting the ID to a more node-specific one is in my eyes only patch-work to fix a sympton, but not the underlying issue (if two nodes have the same ID by chance the problem would still exist)
I mean the implementation of our ping is largely based on the inetutil
one and the original ping implementation (they just use the PID of the process) and they don't have these kind of problems.
@maribu can you supply with the code of your modified node B?
Yes the second case (pinging two different nodes) is fixed by source address checking, but not the first case (pinging the same node twice).
I don't actually know what is the intended use of id
or whether it's supposed to be unique per node or unique per invocation. But if we make it unique per invocation then it helps solve case 1:
ping6
to send a ping to any nodeping6
again to ping the same nodethey don't have these kind of problems
I agree and I think maybe it's worth noting that delayed pongs are required to trigger this issue, and Riot is probably the first OS I've encountered so many delayed pongs with, and that's maybe why it's an issue for us and not others. Usually it's just due to xtimer issues or other highly experimental stuff. In most normal cases and for most normal OSs I think it's likely that delayed pongs are so uncommon that this would be a nonissue.
Ok, sorry then I misunderstood you. id
should change "enough" to be distinguishable between invocations. As far as I understand luid_custom()
this should be the case, but luid
might be wrong there. I'll investigate.
Mhhh indeed luid_custom()
does not do what I thought it does. However, I think I go for a different approach then you propose, since a different ID is also not guaranteed if luid
is not provided, so I think we need a global counter.
(will provide a separate fix tomorrow)
@maribu can you supply with the code of your modified node B?
I used unmodified RIOT master with one node using an at86rf2xx (802.15.4 breakout header for RPi by OpenLabs) and a nrf52840-dk. I did not investigate further yet what the problem is yet, so I cannot provide any details.
But your steps to reproduce say
- Modify device B to send multiple responses over some extended period of time (e.g. seconds)
(will provide a separate fix tomorrow)
But your steps to reproduce say
- Modify device B to send multiple responses over some extended period of time (e.g. seconds)
Yes, and this is how I would reproduce the issue in the most generalized and controlled way ;-) I could provide a small program to trigger the issue using e.g. Linux RAW sockets. It would require node B to be a Linux host and node A to use e.g. Ethernet. But I won't have time for that until next week.
I think it should be possible to use a debugger to halt the CPU either during or after a ping reception and then unhalt it to release the pong at a controlled time.
I think it should be possible to use a debugger to halt the CPU either during or after a ping reception and then unhalt it to release the pong at a controlled time.
I did not test it that way, but whoever reviews #11938 could do so :-)
Apparently, just rewording the OP of the PR was not enough.
@benemorius @maribu please check if this is still an issue with #11933 and #11938 being merged.
Will do in a few hours hopefully. (Sorry for being so unresponsive. This week is super busy :-/)
Thanks for your fixes!
With current RIOT master sending pings using an nrf52840dk
to a Nucleo-F767ZI (with disabled stm32-eth) supplied with an AT86RF233 (OpenLabs RPi 802.15.4 breakout board):
(Notice how ping6
gets confused on the second run.)
Output of the Nucleo (debug in the AT86RF233 driver enabled):
The potential "driver issue" was as simple to solve as running the following to commands:
> ifconfig 7 -ack_req
> ifconfig 7 -autoack
on the Nucleo with the AT86RF233.
Sorry I didn't manage to test any of that until now. My result agrees with @maribu: id
has become unique per invocation now but the /* not our ping */
check fails to reject it.
It looks like #11938 worked correctly to reject same-node pongs with different ids but then #11933 broke the logic with a typo that only rejects a pong if it has both a wrong id and a wrong address: https://github.com/RIOT-OS/RIOT/blob/4a723fe4fd7a8f3d8a3376ad9a4bbabf396d11be/sys/shell/commands/sc_gnrc_icmpv6_echo.c#L341
On master if I change &&
to ||
I get the expected result that same-node pongs and wrong source address pongs are ignored.
To test this in a controlled setting I captured and injected ping replies with a Linux node connected via a border router to a Riot node using this procedure:
1) Prepare Linux node to capture its own outgoing ping reply:
tcpdump -i eth0 "icmp6 && ip6[40] == 129" -n -c1 -w icmp-reply.pcap
1) Use the border router to block the genuine ping reply from reaching the Riot node:
ip6tables -I FORWARD -o lowpan0 -p icmpv6 --icmpv6-type echo-reply -j DROP
1) Ping Linux node from Riot node using 1 ping and a 60 second timeout:
ping6 2001:470:4bb0:ffff::1 -c 1 -W 60000
1) On Linux node modify the source address of the captured packet:
tcprewrite --infile=icmp-reply.pcap --outfile=icmp-reply2.pcap --pnat=[2001:470:4bb0:ffff::1]:[2001:470:4bb0:ffff::2]
1) On border router unblock ping replies:
ip6tables -D FORWARD -o lowpan0 -p icmpv6 --icmpv6-type echo-reply -j DROP
1) On Linux node send modified ping reply:
tcpreplay --intf1=eth0 icmp-reply2.pcap
1) Observe whether the Riot node prints the reply. It should not.
12 bytes from 2001:470:4bb0:ffff::2 id:0xbe45/0xbe45 icmp_seq=0 ttl=63 rssi=-85 dBm time=13823.374 ms
1) As before, block ping replies at the border router:
ip6tables -I FORWARD -o lowpan0 -p icmpv6 --icmpv6-type echo-reply -j DROP
1) As before, send a ping from the Riot node:
ping6 2001:470:4bb0:ffff::1 -c 1 -W 60000
1) As before, unblock ping replies:
ip6tables -D FORWARD -o lowpan0 -p icmpv6 --icmpv6-type echo-reply -j DROP
1) From the Linux node send the unmodified ping reply captured earlier:
tcpreplay --intf1=eth0 icmp-reply.pcap
1) Observe whether the Riot node prints the reply. It should not.
12 bytes from 2001:470:4bb0:ffff::1 id:0xbe45/0x2e08 icmp_seq=0 ttl=63 rssi=-86 dBm time=52156.860 ms
At 7 and 12 Riot should not print the ping reply. But on master it prints both 7 and 12. After changing &&
to ||
on master, 7 and 12 both stop printing.
@maribu @benemorius Thanks for testing, it seems this was some wrongful due diligence on my side when testing #11933, I was also able to reproduce the findings and test the proposed fix. Do you wan't to open the PR yourself @miri64?
Has someone already opened a PR to change the &&
to ||
?
@maribu Im AFK for the next two weeks, do you think you or @benemorius can open the PR and review it?
@benemorius: Mind to open the PR? I'll test and review then.
Description
ping6
does check received pongs properlySteps to reproduce the issue
ping6
node B more than onceping6
node B and immediately afterwardsping6
some unreachable/unassigned IPv6 addressExpected results
ping6
ignores any pongs not matching a ping it send and filters out duplicatesActual results
ping6
correctly filters out duplicates (it does print them marked as duplicates but ignores them for the statistics - which is very nice design choice)ping6
treats every pong it receives as response (skipping duplicates):ping6
is counted as a response.ping6
are even reported even if the pinged address no longer matchesping6
might end after receiving 3 pongs without having send 3 pingsE.g. see the stats here:
Versions
(At least) current master