the-tcpdump-group / libpcap

the LIBpcap interface to various kernel packet capture mechanism
https://www.tcpdump.org/
Other
2.62k stars 838 forks source link

Sample ipv4 packet matched by several sample filters iff optimization is enabled. #430

Open kbara opened 9 years ago

kbara commented 9 years ago

Some filters with out-of-bounds packet access behave differently depending on if optimizations are enabled or not. In a two-clause expression with an out of bounds packet access in the first clause, joined by the 'or' operator to a clause that would match a packet, with the optimizer disabled, a filter never matches. With the optimizer enabled, the second clause is sometimes run, allowing the filter to match packets.

tcpdump version 4.5.1 libpcap version 1.5.3

The examples are run on a 66-byte tcp ack packet, which can be downloaded from https://github.com/Igalia/pflua/raw/master/tests/data/tcp-ack-66-bytes.pcap

"ether[1] > tcp[100] or ip" triggers this inconsistency; it matches the test packet with optimizations enabled, and does not match with optimizations disabled.

% tcpdump -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or ip" 2>/dev/null | grep IP || echo "no match"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or ip" 2>/dev/null | grep IP || echo "no match"
no match

Here are more test cases which show similar behavior. Note that the test packet is 66 bytes, so tcp[100], ether[100], and ip[100] are all out of bounds, and that the protocols match the packet.

The following expressions show problematic behavior:
ether[1] > tcp[100] or ip
ether[1] > tcp[100] or tcp
ether[1] > tcp[100] or 1 >= 1
ether[1] > tcp[100] or not arp
ether[100] > tcp[1] or tcp
ether[100] > tcp[100] or tcp
ip[100] > ether[1] or ip

Details:

% tcpdump -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or ip" 2>/dev/null | grep IP || echo "no match"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or ip" 2>/dev/null | grep IP || echo "no match"
no match

---
% tcpdump -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or tcp" 2>/dev/null | grep IP || echo "no match"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or tcp" 2>/dev/null | grep IP || echo "no match"
no match

---
% tcpdump -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or 1 >= 1" 2>/dev/null | grep IP || echo "no match"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or 1 >= 1" 2>/dev/null | grep IP || echo "no match"
no match

---
% tcpdump -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or not arp" 2>/dev/null | grep IP || echo "no match"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or not arp" 2>/dev/null | grep IP || echo "no match"
no match

---
% tcpdump -ntr tcp-ack-66-bytes.pcap "ether[100] > tcp[1] or tcp" 2>/dev/null | grep IP || echo "no match"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ether[100] > tcp[1] or tcp" 2>/dev/null | grep IP || echo "no match"
no match

---
% tcpdump -ntr tcp-ack-66-bytes.pcap "ether[100] > tcp[100] or tcp" 2>/dev/null | grep IP || echo "no match"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ether[100] > tcp[100] or tcp" 2>/dev/null | grep IP || echo "no match"
no match

---
% tcpdump -ntr tcp-ack-66-bytes.pcap "ip[100] > ether[1] or ip" 2>/dev/null | grep IP || echo "no match"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ip[100] > ether[1] or ip" 2>/dev/null | grep IP || echo "no match"
no match
The following similar expressions show consistent behavior:
ether[1] > tcp[100] or ip[1] >= 0
ip[100] > ether[1] or tcp

Details:

% tcpdump -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or ip[1] >= 0" 2>/dev/null | grep IP || echo "no match"
no match
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ether[1] > tcp[100] or ip[1] >= 0" 2>/dev/null | grep IP || echo "no match"
no match

---
% tcpdump -ntr tcp-ack-66-bytes.pcap "ip[100] > ether[1] or tcp" 2>/dev/null | grep IP || echo "no match"
no match
% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ip[100] > ether[1] or tcp" 2>/dev/null | grep IP || echo "no match"
no match

Source to run the above tests:

#!/usr/bin/env python
import subprocess

def print_and_run(cmd):
    print("% " + cmd)
    subprocess.call(cmd, shell=True)

def gen_and_run():
    template_opt = 'tcpdump -ntr tcp-ack-66-bytes.pcap "%s" 2>/dev/null | grep IP || echo "no match"'
    template_unopt = 'tcpdump -O -ntr tcp-ack-66-bytes.pcap "%s" 2>/dev/null | grep IP || echo "no match"'

    prob_pfs = [ "ether[1] > tcp[100] or ip",
                 "ether[1] > tcp[100] or tcp",
                 "ether[1] > tcp[100] or 1 >= 1",
                 "ether[1] > tcp[100] or not arp",
                 "ether[100] > tcp[1] or tcp",
                 "ether[100] > tcp[100] or tcp",
                 "ip[100] > ether[1] or ip"
               ]
    print("The following expressions show problematic behavior:")
    print("\n".join(prob_pfs))
    print("\nDetails:\n")

    for pf in prob_pfs:
        print_and_run(template_opt % pf)
        print_and_run(template_unopt % pf)
    print("---")

    good_pfs = [ "ether[1] > tcp[100] or ip[1] >= 0",
                 "ip[100] > ether[1] or tcp"
               ]
    print("The following similar expressions show consistent behavior:")
    print("\n".join(good_pfs))
    print("\nDetails:\n")

    for gpf in good_pfs:
        print_and_run(template_opt % gpf)
        print_and_run(template_unopt % gpf)
    print("---")
gen_and_run()
guyharris commented 9 years ago

This has nothing to do with out of bound access - the BPF code generator optimizes away the "ether[1] > tcp[100]" test, so that test isn't even performed, regardless of whether there's packet data 100 bytes following the beginning of the TCP header or not.

guyharris commented 9 years ago

That doesn't happen for the expression "ether[1] > tcp[100]", but does happen for "ether[1] > tcp[100] or ip".

Somehow, if you do "ether[1] > tcp[100]" without the optimizer, it tests for both IPv4 and IPv6 TCP packets where ether[1] > tcp[100], but, with the optimizer, the IPv6 test is removed. If no IPv6 test is done, "ether[1] > tcp[100] or ip" is equivalent to just "ip", as all packets matched by "ether[1] > tcp[100]" are IP packets, but if the IPv6 test is done, there might be packets matched by "ether[1] > tcp[100]" that aren't matched by "ip", because they're IPv6 packets not matched by "ip" (you need "ip6" for that).

kbara commented 9 years ago

Is "ether[1] > tcp[100] or ip" really equivalent to ip, even given that it's an ipv4 packet? My understanding is that an out-of-bounds packet access makes the whole expression fail, not just the current clause, which invalidates some optimization ideas that would work if clauses were independent.

% tcpdump -O -ntr tcp-ack-66-bytes.pcap "ip[100] >= 0 or 1=1"
(no match)

Tangentially, this shows the same difference in behaviour with the optimizer on:

% tcpdump -ntr tcp-ack-66-bytes.pcap "ip[100] >= 0 or 1=1"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0
guyharris commented 9 years ago

BTW, if anybody's curious about the history of pflang, It's All Van And Steve's Fault. See the video and slides from Steve McCanne's keynote presentation at Sharkfest '11.

guyharris commented 9 years ago

Given that the behavior of which I speak can be demonstrated by using the -d flag to tcpdump, without running the filters on any packet data whatsoever, the out-of-bounds packet accesses are completely irrelevant. The optimizer doesn't take into account the possibility of a bounds check failure, so it simply optimizes out "ether[1] > tcp[100]" after having (incorrectly, I think) optimized out the test for IPv6 in that expression.

guyharris commented 9 years ago

It's probably best to think of Steve's little 6502-flavored machine as having "hardware" bounds checking, with bounds check faults causing a failure, and of the compiler as ignoring that characteristic, so that any "as if" rules in optimization ignore the possibly of a bounds check fault.

guyharris commented 9 years ago

The underlying problem is that, for Internet protocol family transport-layer protocols, use in general expressions such as "tcp[100]" generates code that tests only for IPv4, not for IPv6. As a comment in gen_load() says:

             * XXX - we should, if we're built with
             * IPv6 support, generate code to load either
             * IPv4, IPv6, or both, as appropriate.
kbara commented 9 years ago

Thank you for the link to the keynote; the slides on libpcap's history are quite interesting, and provide some perspective. But, going back to the bug report:

I see two different issues here, and I would appreciate if they were not conflated.

a) issues related to IPv6. As both man pcap-filter and the source say, there are a number of interesting things here, including many filters implying ipv4. I would appreciated if you would open a separate bug if you'd like to consider these issues.

b) I have provided fairly minimal test cases of one ipv4 packet and half a dozen tiny filters that match it when optimization is enabled, and do not match it when optimization is disabled. I would appreciate if this was confirmed and fixed, or if you would clearly explain why you do not consider it to be a bug and say that the behaviour I have reported is expected.

I've avoided speculating about the root cause; I've merely reported some test cases I've found that trigger this behaviour, and remarked on features they have in common. As you've pointed out, the test in the first clause is being optimized out; if that is the root cause of the behaviour I'm reporting, I would question whether that is a valid optimization, depending on the intended semantics of the language.

kbara commented 9 years ago

On the off-chance it's marginally interesting, tcp[10000] != 0 or tcp displays the same disparity in behaviour on the sample packet.

guyharris commented 9 years ago

I would question whether that is a valid optimization, depending on the intended semantics of the language.

I have seen nothing in the tcpdump man page, the pcap-filter man page, or the USENIX paper on BPF to indicate that the "and" and "or" operators in the filter language are to be treated as C-style short-circuit operators for which the order is important or that failure on an out-of-bounds reference is to be treated as behavior that cannot be optimized out by optimizing out the references. (I don't consider the fact that you can write "and" and "or" as "&&" and "||", using the same symbols as C-style short-circuit operators, to be explicit indications that they are to be treated as C-style short-circuit operators.) The BPF paper speaks of the ability to re-organize the control flow graph "so [a] value is only used at nodes that follow the original computation", which implies, for example, removal of redundant tests.

kbara commented 9 years ago

I'm not assuming C-style short-circuit behaviour; it's not consistent with what tcpdump shows with the optimizer enabled, although I can't think of an unoptimized expression off-hand which is not compatible with it or left-to-right evaluation of logical clauses.

"tcp or tcp[100000] != 0" matches the sample packet, with or without optimization, which is compatible with short-circuit behaviour.

On the other hand, tcp or 1/0 == 0's behaviour is interesting. With or without optimization, it diverges, in the opposite order from examples seen so far. The filter matches only with optimization disabled; this is compatible with, but not proof for, the idea that without optimizations, it is short-circuting, and with them it is not:

% tcpdump  -ntr tcp-ack-66-bytes.pcap "tcp or 1/0 == 0"
tcpdump: division by zero
% tcpdump  -O -ntr tcp-ack-66-bytes.pcap "tcp or 1/0 == 0"
IP 149.174.156.93.54192 > 178.79.150.233.80: Flags [.], ack 3209860838, win 31, options [nop,nop,TS val 2756387939 ecr 4173199779], length 0

Like you, I have not seen documentation which answers my questions about the semantics of the language. If clause order is not important, users of the language cannot even rely on adding explicit length checks before doing a packet access to avoid the divergences in behaviour in the out-of-bounds examples I've presented. And, for users, it's rather disconcerting to have basic filter expressions match or not match depending on whether optimization is enabled. It's not particularly great if developers can't tell whether a 12-character expression like ip or 1/0==0 should match packets or not, either; it's certainly at odds with the spirit of the presentation you linked to.

I think the key question here is whether what's happening is user-hostile and developer-hostile undefined behaviour, making the optimizations valid but destroying the idea of many filters having defined semantics, or whether having consistent behaviour is important and the optimizer needs to be redefined to take more things into account.

guyharris commented 9 years ago

The filter matches only with optimization disabled

The filter 1/0 == 0 is successfully compiled only with optimization disabled. Tcpdump doesn't even run if you optimize that one, as the compiler returns an error, so the filter, not being run, neither matches nor fails to match.

And short-circuiting has nothing to do with that - it happens regardless of the order in which things are tested:

$ tcpdump -d '1/0 == 0 or tcp'
tcpdump: division by zero
$ tcpdump -d 'tcp or 1/0 == 0'
tcpdump: division by zero

(the fact that tcpdump doesn't print out generated filter code means that no such code was generated; "division by zero" is an error string returned by pcap_compile(), not a string printed by the BPF interpreter when detecting a division by zero at run time).

The reason why the behavior is different when optimization is enabled and when it's not enabled is that the check for division and modulus by zero is done in the optimizer at the time it does compile-time evaluation of expressions. I've checked in a change to detect division by a constant zero at code generation time prior to optimization, so the checks are done regardless of whether optimization is enabled. That way, ip or 1/0 == 0 won't make it through the compiler under any circumstances, so developers and users will know that ip or 1/0 == 0 will not even be given the chance to match anything.

The key question here is whether run-time exceptions should be treated, by users and developers, as a mechanism for doing tests that match or don't match packets, so that code that could generate run-time bounds exceptions cannot be optimized away, or whether, if they want to do length checks, they should do so explicitly rather than relying on the bounds exception causing the entire program to fail rather than, for example, causing only tests using the value whose fetch caused the bounds exception to fail. I vote for the latter.

guyharris commented 9 years ago

The semantics should be that an subexpression that makes an out-of-bounds reference or gets a {divide,modulo}-by-zero error evaluates to false, rather than causing the entire expression to return false. Unfortunately, that's difficult to implement efficiently when generating code for the existing BPF machine; there are other things that are also difficult to implement, such as IPv6 protocol chasing, so the entire filtering mechanism needs a rethink.

kbara commented 9 years ago

I think that is probably too much of a breaking change, given the amount of history libpcap has, and how widely it is used. What do you think about the idea that "libpcap's optimizer should never change the accept/reject status of any given packet+filter pair, compared to unoptimized libpcap"?