martinvonwittich closed this 5 years ago
Thank you for a very thorough bug report!
First of all I should ask why you say this is a false positive. Isn't it a false negative? The IP address is being used, just by the pinging server as a special case. It becomes a false negative unless another machine on the network sends ARP requests (and thus gets replies), right?
The behaviour you see is because, as you may already know, arping doesn't check the source MAC of the reply. That would have been around here.
Now for the question: Should it? Maybe yes. If it does, then it'll be unaffected by other machines' ARP requests, and I agree this is probably the expected result. Maybe accept either the requestor's MAC or the broadcast address.
Normally that should already be the behavior because promiscuous mode is off by default (-p option), but there appears to be a special case when it's the local host sending it out.
But yes, it does seem like checking that the dst mac is the same as the outgoing request mac is the right thing to do.
I'll be a bit busy over the next few weeks, but I'll get to fixing this. (Also, pull requests are welcome. Defaulting to the new behaviour is fine, but I'll want a flag to let it continue to accept any reply. Or maybe the existing -p is enough.)
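For anyone who wants to experiment before a fix lands, here is a minimal Python sketch of the check discussed above: accept an ARP reply only if its Ethernet destination is the MAC the request was sent from, or optionally the broadcast address. This is not arping's code. The names `is_reply_to_us`, `build_arp`, `mac_bytes` and `ip_bytes` are made up for illustration, and the sketch assumes plain IPv4-over-Ethernet ARP frames without VLAN tags.

```python
import struct

ETH_HEADER = struct.Struct("!6s6sH")        # dst MAC, src MAC, EtherType
ARP_BODY = struct.Struct("!HHBBH6s4s6s4s")  # IPv4-over-Ethernet ARP payload
ETHERTYPE_ARP = 0x0806
ARP_OP_REPLY = 2
BROADCAST = b"\xff" * 6


def mac_bytes(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in text.split(":"))


def ip_bytes(text):
    """Convert a dotted-quad IPv4 address into 4 raw bytes."""
    return bytes(int(part) for part in text.split("."))


def build_arp(dst, src, op, sha, spa, tha, tpa):
    """Build a 42-byte Ethernet + ARP frame (hypothetical test helper)."""
    return (ETH_HEADER.pack(dst, src, ETHERTYPE_ARP)
            + ARP_BODY.pack(1, 0x0800, 6, 4, op, sha, spa, tha, tpa))


def is_reply_to_us(frame, own_mac, target_ip, allow_broadcast=True):
    """True if frame is an ARP reply from target_ip that is addressed to us."""
    if len(frame) < ETH_HEADER.size + ARP_BODY.size:
        return False
    dst, _src, ethertype = ETH_HEADER.unpack_from(frame, 0)
    if ethertype != ETHERTYPE_ARP:
        return False
    fields = ARP_BODY.unpack_from(frame, ETH_HEADER.size)
    op, sender_ip = fields[4], fields[6]
    if op != ARP_OP_REPLY or sender_ip != target_ip:
        return False
    # The proposed extra check: ignore replies meant for another requester.
    return dst == own_mac or (allow_broadcast and dst == BROADCAST)
```

With this filter, a reply that the local host sends to some other machine is rejected because its destination MAC is that machine's address, not the requester's own.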
First of all I should ask why you say this is a false positive. Isn't it a false negative? The IP address is being used, just by the pinging server as a special case. It becomes a false negative unless another machine on the network sends ARP requests (and thus gets replies), right?
Hmm, I'm not sure, but I would argue that it is a false positive. The normal behavior IMO would be:
arping sends out an ARP request:
test ~ # arping -r -c1 -C2 -w20000 -i enp1s0 172.17.56.10
test ~ #
The ARP request is sent out to the network:
test ~ # tshark -i enp1s0 -f arp
Running as user "root" and group "root". This could be dangerous.
tshark: Lua: Error during loading:
/usr/share/wireshark/init.lua:32: dofile has been disabled due to running Wireshark as superuser. See https://wiki.wireshark.org/CaptureSetup/CapturePrivileges for help in running
Wireshark as an unprivileged user.
Capturing on 'enp1s0'
1 0.000000000 02:00:ac:11:fb:18 → Broadcast ARP 42 Gratuitous ARP for 172.17.56.10 (Request)
^C1 packet captured
As long as there is no network problem (e.g. a loop), this packet isn't received by the sending host itself, and so Linux won't respond to this request because it originates from the local host.
This is a true negative - there is no response, and therefore arping doesn't show any replies and exits with a non-zero exit code.
In the problem I've outlined above, there is no real response either - arping just mistakes a response from the local host directed to another machine for a response to its own request, and now incorrectly reports this as a response and exits with exit code 0. I would therefore call this a false positive :)
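To make the true negative / false positive distinction concrete, here is a small Python model of the three-frame exchange from the report. It is only a sketch: the router's MAC and IP are invented, frame 2 is assumed to be a broadcast request, and the exit codes 0 and 1 merely stand in for "reply seen" and "no reply seen".

```python
from dataclasses import dataclass

BROADCAST = "ff:ff:ff:ff:ff:ff"
SERVER_MAC = "02:00:ac:11:fb:18"  # taken from the tshark output above
SERVER_IP = "172.17.56.10"
ROUTER_MAC = "02:00:00:00:00:01"  # hypothetical
ROUTER_IP = "172.17.56.1"         # hypothetical


@dataclass(frozen=True)
class ArpFrame:
    number: int
    op: str         # "request" or "reply"
    src_mac: str
    dst_mac: str
    sender_ip: str


CAPTURE = [
    ArpFrame(1, "request", SERVER_MAC, BROADCAST, SERVER_IP),  # arping's own request
    ArpFrame(2, "request", ROUTER_MAC, BROADCAST, ROUTER_IP),  # router asks for the server
    ArpFrame(3, "reply", SERVER_MAC, ROUTER_MAC, SERVER_IP),   # server answers the router
]


def counts_any_reply(frame, target_ip, own_mac):
    """Behaviour as reported: any reply from the target IP is counted."""
    return frame.op == "reply" and frame.sender_ip == target_ip


def counts_only_replies_to_us(frame, target_ip, own_mac):
    """Proposed behaviour: the reply must also be addressed to us (or broadcast)."""
    return (counts_any_reply(frame, target_ip, own_mac)
            and frame.dst_mac in (own_mac, BROADCAST))


def modelled_exit_code(capture, matcher, target_ip, own_mac):
    """0 if at least one frame is counted as a reply, otherwise 1."""
    return 0 if any(matcher(f, target_ip, own_mac) for f in capture) else 1
```

In this model, `counts_any_reply` counts frame 3 and yields 0 (the false positive), while `counts_only_replies_to_us` rejects it because frame 3 is addressed to the router, and yields 1.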
Now for the question: Should it? Maybe yes. If it does, then it'll be unaffected by other machines' ARP requests, and I agree this is probably the expected result.
I would also argue that this should be the default behavior, yes. I've considered suggesting a command-line switch to enable the new behavior, but the old behavior seems so wrong that I cannot imagine that anyone would actually expect it :D
I've found a mention of another customer server in our internal bug tracker, and there the problem is even more extreme. Apparently they have a lot of Cisco devices that immediately respond to ARP lookups from our server:
other-customer-server ~ # arping -c1 10.0.0.1
ARPING 10.0.0.1
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=0 time=11.446 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=1 time=11.463 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=2 time=11.469 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=3 time=11.474 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=4 time=11.478 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=5 time=11.483 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=6 time=11.487 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=7 time=11.495 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=8 time=11.500 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=9 time=11.506 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=10 time=11.510 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=11 time=11.515 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=12 time=11.522 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=13 time=11.527 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=14 time=11.532 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=15 time=11.537 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=16 time=11.543 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=17 time=11.548 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=18 time=11.553 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=19 time=11.558 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=20 time=11.563 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=21 time=11.568 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=22 time=11.572 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=23 time=11.577 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=24 time=11.584 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=25 time=11.589 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=26 time=11.594 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=27 time=11.599 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=28 time=11.604 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=29 time=11.609 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=30 time=11.614 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=31 time=11.619 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=32 time=11.624 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=33 time=11.628 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=34 time=11.633 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=35 time=11.638 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=36 time=11.643 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=37 time=11.648 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=38 time=11.653 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=39 time=11.658 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=40 time=11.663 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=41 time=11.668 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=42 time=11.673 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=43 time=11.678 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=44 time=11.682 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=45 time=11.687 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=46 time=11.693 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=47 time=11.698 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=48 time=11.702 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=49 time=11.708 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=50 time=11.713 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=51 time=11.718 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=52 time=27.428 msec
42 bytes from 4c:ed:fb:91:2a:c6 (10.0.0.1): index=53 time=523.472 msec
--- 10.0.0.1 statistics ---
1 packets transmitted, 54 packets received, 0% unanswered (53 extra)
rtt min/avg/max/std-dev = 11.446/21.362/523.472/69.003 ms
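As a rough way to spot this situation from a script, the statistics line can be compared against the number of requests sent. The Python sketch below assumes the exact wording shown in the output above ("N packets transmitted, M packets received"). Other arping versions may print something different, and `extra_replies` is a made-up helper name.

```python
import re

# Pattern based only on the statistics line shown in the output above.
STATS_RE = re.compile(
    r"(?P<sent>\d+) packets transmitted, (?P<received>\d+) packets received"
)


def extra_replies(arping_output):
    """Return how many more replies than requests were seen, or None if no stats line."""
    match = STATS_RE.search(arping_output)
    if match is None:
        return None
    return max(0, int(match["received"]) - int(match["sent"]))


sample = (
    "--- 10.0.0.1 statistics ---\n"
    "1 packets transmitted, 54 packets received,   0% unanswered (53 extra)\n"
)
print(extra_replies(sample))  # 53
```

A non-zero result means at least some of the counted replies cannot have been answers to arping's own requests.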
I'll be a bit busy over the next few weeks, but I'll get to fixing this.
Take your time, it's not really that significant of a problem. I've told our developers that they should probably just filter the server's own IPs from the response; that should solve the problem for us.
We use arping on customer systems to ensure that the server IP isn't used by any other devices in the LAN. The command we use looks like this:
On one customer system, we've encountered a false positive - arping claims that the IP is used by the server itself:
(IMO this shouldn't happen, because the server doesn't actually respond to its own request, as you can see in the tshark output below.)
With tshark I was able to figure out that whenever the server sends an ARP lookup for its own IP, the DSL router automatically responds with its own ARP lookup for the server's IP, which the server then responds to:
arping is apparently confused by this and believes that the response (frame 3) to the DSL router's request (frame 2) is actually a response to its own request (frame 1).
This problem is easily reproducible by having one arping instance ping its own server, and then another arping instance on another server pinging the first server. For example, when I run this command on my test server to ping itself, it doesn't get any responses (as expected):
But when I then run the following command on another server to ping my server:
Then suddenly the arping on my server shows responses: