the-tcpdump-group / libpcap

the LIBpcap interface to various kernel packet capture mechanisms
https://www.tcpdump.org/

Experimenting with BPF: Return Codes #985

Open splitice opened 3 years ago

splitice commented 3 years ago

Hi,

I'm currently exploring applications of cbpf beyond filtering (ranking, sorting, etc.), many of which require return values other than the snaplen. To that end I've been hand-crafting BPF programs with multiple return values, with some success.
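To illustrate what a hand-crafted multi-return program can look like, here is a minimal sketch in Python. The `Insn` tuple only mirrors the fields of libpcap's `struct bpf_insn` (code, jt, jf, k); the `CLASSIFY` program, `run_filter`, and `make_frame` are hypothetical names invented for this example, and the interpreter handles only the four opcodes the program uses. It assumes Ethernet framing and returns 1 for IPv4 TCP, 2 for IPv4 UDP, and 3 for everything else.

```python
from typing import NamedTuple, Sequence


class Insn(NamedTuple):
    """Illustrative stand-in for struct bpf_insn."""
    code: int
    jt: int  # relative jump offset if the test is true
    jf: int  # relative jump offset if the test is false
    k: int


# Classic BPF opcode values
LDH_ABS = 0x28  # BPF_LD | BPF_H | BPF_ABS
LDB_ABS = 0x30  # BPF_LD | BPF_B | BPF_ABS
JEQ_K = 0x15    # BPF_JMP | BPF_JEQ | BPF_K
RET_K = 0x06    # BPF_RET | BPF_K

# Hypothetical classifier: 1 = IPv4 TCP, 2 = IPv4 UDP, 3 = anything else.
CLASSIFY = [
    Insn(LDH_ABS, 0, 0, 12),     # (000) ldh [12]    ethertype
    Insn(JEQ_K, 0, 5, 0x0800),   # (001) jeq #0x800  jt 2 jf 7
    Insn(LDB_ABS, 0, 0, 23),     # (002) ldb [23]    IPv4 protocol
    Insn(JEQ_K, 0, 1, 6),        # (003) jeq #0x6    jt 4 jf 5
    Insn(RET_K, 0, 0, 1),        # (004) ret #1
    Insn(JEQ_K, 0, 1, 17),       # (005) jeq #0x11   jt 6 jf 7
    Insn(RET_K, 0, 0, 2),        # (006) ret #2
    Insn(RET_K, 0, 0, 3),        # (007) ret #3
]


def run_filter(prog: Sequence[Insn], pkt: bytes) -> int:
    """Run the tiny opcode subset above; out-of-range loads return 0."""
    a = 0
    pc = 0
    while pc < len(prog):
        insn = prog[pc]
        pc += 1
        if insn.code == LDH_ABS:
            if insn.k + 2 > len(pkt):
                return 0
            a = int.from_bytes(pkt[insn.k:insn.k + 2], "big")
        elif insn.code == LDB_ABS:
            if insn.k >= len(pkt):
                return 0
            a = pkt[insn.k]
        elif insn.code == JEQ_K:
            pc += insn.jt if a == insn.k else insn.jf
        elif insn.code == RET_K:
            return insn.k
        else:
            raise ValueError(f"unsupported opcode {insn.code:#x}")
    return 0


def make_frame(ethertype: int, proto: int) -> bytes:
    """Build a zeroed 64-byte Ethernet frame with the two fields set."""
    frame = bytearray(64)
    frame[12:14] = ethertype.to_bytes(2, "big")
    frame[23] = proto
    return bytes(frame)
```

Calling `run_filter(CLASSIFY, make_frame(0x0800, 17))` classifies an IPv4 UDP frame. The return value is just the `k` of whichever `ret` instruction is reached, which is why the snaplen slot can carry an arbitrary code.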

To ease my research I thought about patching libpcap to support some form of alternative return codes. At first I thought of introducing something like "and return $value", before realizing how much restructuring of gencode.c that would require.

My second idea was to introduce "or" paths to the starting ldb in the icode and make pcap_compile support multiple c programs, compiling each and adding them to the flow graph before optimization.

I was wondering if anyone had any feedback on this approach? Do you think it might work? Do you think there is an easier hack?

I'm just looking for something to make my job a bit easier than fully hand-crafted expressions, preferably while keeping access to the existing BPF optimizer.

splitice commented 3 years ago

After trying to develop a modified pcap_compile I decided to give the grammar modification a try and it turned out easier than expected.

It's not perfect (it still creates the final retblock, for example), and it looks like the optimizer isn't exactly prepared to eliminate unreachable nodes (due to the retblock). But it looks like it should work.

https://github.com/splitice/libpcap/commit/8fee68926ce36029342ef4134256ec898f869afa

I'll do some testing on it tomorrow, however, to be sure I haven't broken anything.

infrastation commented 3 years ago

Are you trying to solve a practical problem or just experimenting?

splitice commented 3 years ago

@infrastation a practical problem if cbpf turns out to work well in the situation.

eBPF is also being evaluated but has nothing on the simplicity of a cbpf solution.

(tcp return 1) or (udp[4:2] == 1 return 2) or (return 3) is much simpler than the equivalent eBPF program.

fenner commented 3 years ago

I have thought about this some - I was going to try to solve it outside of libpcap as follows:

e.g., take:

tcpdump -d tcp -s 1 ->

(000) ldh      [12]
(001) jeq      #0x86dd          jt 2    jf 7
(002) ldb      [20]
(003) jeq      #0x6             jt 10   jf 4
(004) jeq      #0x2c            jt 5    jf 11
(005) ldb      [54]
(006) jeq      #0x6             jt 10   jf 11
(007) jeq      #0x800           jt 8    jf 11
(008) ldb      [23]
(009) jeq      #0x6             jt 10   jf 11
(010) ret      #1
(011) ret      #0

plus tcpdump -d 'udp[4:2]==1' -s 2 ->

(000) ldh      [12]
(001) jeq      #0x800           jt 2    jf 10
(002) ldb      [23]
(003) jeq      #0x11            jt 4    jf 10
(004) ldh      [20]
(005) jset     #0x1fff          jt 10   jf 6
(006) ldxb     4*([14]&0xf)
(007) ldh      [x + 18]
(008) jeq      #0x1             jt 9    jf 10
(009) ret      #2
(010) ret      #0

and in this case you can just replace the "ret #0" in the first set with the instructions from the second set. This will get you the return value you want but will check ethertype, etc. twice, thus the desire to see if the optimizer can optimize the sequence of instructions.

(I haven't even checked yet if this would require a custom libpcap build to be able to access the optimizer from outside of libpcap.)

Bill


mcr commented 3 years ago

I have thought about this some - I was going to try to solve it outside of libpcap as follows:

- compile each program, with the snaplen reflecting the return value
- concatenate the compiled results, replacing the "ret 0" with a jmp that goes to the next program
- run that set of opcodes through the optimizer

It sounds like some kind of refactoring of gencode.c would help you.

@guyharris, as we are talking about API changes to libpcap, I wonder whether there is a way to introduce experimental things without promising to keep them.