seladb / PcapPlusPlus

PcapPlusPlus is a multiplatform C++ library for capturing, parsing and crafting of network packets. It is designed to be efficient, powerful and easy to use. It provides C++ wrappers for the most popular packet processing engines such as libpcap, Npcap, WinPcap, DPDK, AF_XDP and PF_RING.
https://pcapplusplus.github.io/
The Unlicense

Add HTTP2 support #1210

Open jpcofr opened 9 months ago

jpcofr commented 9 months ago

@seladb I noticed that PcapPlusPlus does not support HTTP2 and that someone requested it through the Google group. It also seems that there are no active PRs for this. Do you know if someone is already working on it? If not, I can start working on it...

seladb commented 9 months ago

The main problem with HTTP/2 (formerly known as SPDY) in the context of analyzing and parsing is that this protocol is encrypted by design, unlike previous versions of HTTP which weren't encrypted and needed an additional encryption layer known as SSL/TLS.

We currently don't have a way to decrypt traffic, and even if we had I'm not sure how it'd apply to HTTP/2.

Or maybe you think there is value in parsing the encrypted messages, like PcapPlusPlus already does for protocols like SSL/TLS, SSH, IPSec, etc.? 🤔

jpcofr commented 9 months ago

The main problem with HTTP/2 (formerly known as SPDY) in the context of analyzing and parsing is that this protocol is encrypted by design, unlike previous versions of HTTP which weren't encrypted and needed an additional encryption layer known as SSL/TLS. We currently don't have a way to decrypt traffic, and even if we had I'm not sure how it'd apply to HTTP/2.

I know that Wireshark is able to decrypt when the MAGIC frames in the stream are present. Even when it cannot decrypt, the HTTP2 frames show some unformatted payload. As for how to do it in PcapPlusPlus, I have no idea, and that would be part of the investigation during a PR.
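For reference, here is a minimal sketch of what recognizing the "MAGIC" (the 24-byte client connection preface) and reading the fixed 9-byte HTTP/2 frame header could look like. It assumes the bytes are already cleartext (h2c or an already-decrypted stream). The names `Http2FrameHeader`, `startsWithHttp2Preface` and `parseHttp2FrameHeader` are hypothetical and not part of PcapPlusPlus; the field layout follows the HTTP/2 specification (RFC 9113).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical type for illustration; not a PcapPlusPlus API.
struct Http2FrameHeader
{
    uint32_t length;    // 24-bit payload length
    uint8_t type;       // e.g. 0x0 DATA, 0x1 HEADERS, 0x4 SETTINGS
    uint8_t flags;      // frame-type-specific flags
    uint32_t streamId;  // 31-bit stream identifier (reserved bit cleared)
};

// Client connection preface as defined by the HTTP/2 specification.
static const char kHttp2Preface[] = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";
static const size_t kHttp2PrefaceLen = sizeof(kHttp2Preface) - 1;  // 24 bytes
static const size_t kHttp2FrameHeaderLen = 9;

// Returns true if the buffer begins with the client connection preface.
bool startsWithHttp2Preface(const uint8_t* data, size_t len)
{
    assert(data != nullptr);
    return len >= kHttp2PrefaceLen &&
           std::memcmp(data, kHttp2Preface, kHttp2PrefaceLen) == 0;
}

// Parses the fixed 9-byte frame header. Returns false if fewer than 9 bytes
// are available; the payload itself is not touched.
bool parseHttp2FrameHeader(const uint8_t* data, size_t len, Http2FrameHeader& out)
{
    assert(data != nullptr);
    if (len < kHttp2FrameHeaderLen)
        return false;

    out.length = (static_cast<uint32_t>(data[0]) << 16) |
                 (static_cast<uint32_t>(data[1]) << 8) |
                 static_cast<uint32_t>(data[2]);
    out.type = data[3];
    out.flags = data[4];
    // The top bit of the stream identifier is reserved and must be ignored.
    out.streamId = ((static_cast<uint32_t>(data[5]) & 0x7F) << 24) |
                   (static_cast<uint32_t>(data[6]) << 16) |
                   (static_cast<uint32_t>(data[7]) << 8) |
                   static_cast<uint32_t>(data[8]);
    return true;
}
```

A frame's payload (`length` bytes) follows the header directly, so a parser would advance by `9 + length` to reach the next frame, provided the whole frame fits in the available bytes.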

Or maybe you think there is value in parsing the encrypted messages, like PcapPlusPlus already does for protocols like SSL/TLS, SSH, IPSec, etc.? 🤔

Yes! My initial interest in this project is to be able to reconstruct payloads from HTTP2 (e.g. to easily reconstruct JSON payloads for 3GPP 5G Network Function interactions). I still need to review the code more carefully. Assuming that the HTTP functionality is incomplete (I saw in the docs that HTTP only handles headers), I propose the following:

If you agree, just create these two tasks and assign them to me. This time I'll branch dev...

seladb commented 9 months ago

I think that Wireshark uses OpenSSL to decrypt data, not sure about HTTP2 though... However, we don't want to add OpenSSL as a dependency because it will make the build process more complex. If you find a way to decrypt without an external dependency then I'm all for it!

Regarding full HTTP implementation - the reason only headers are supported is that full HTTP messages (especially responses) are often spread over more than one packet, but parsing in PcapPlusPlus is done packet-by-packet. Of course, HTTP headers can also be spread over more than one packet (and the parser supports that), but this is a rarer edge case.
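To illustrate the multi-packet problem, here is a rough sketch of collecting HTTP/1.x header bytes across several TCP payloads until the terminating blank line shows up. It assumes the payloads arrive in order with no retransmissions or gaps, which a real capture does not guarantee. `HttpHeaderAccumulator` is a hypothetical helper, not a PcapPlusPlus class, and this is not how the library's parser is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper for illustration; not a PcapPlusPlus API.
// Assumes TCP payloads are appended in sequence order.
class HttpHeaderAccumulator
{
public:
    // Appends one TCP payload. Returns true once the "\r\n\r\n" sequence that
    // ends the HTTP/1.x header section has been seen.
    bool append(const uint8_t* data, size_t len)
    {
        if (m_complete)
            return true;

        // Start the search 3 bytes before the new data, in case the
        // terminator straddles the boundary between two payloads.
        const size_t searchFrom = m_buffer.size() >= 3 ? m_buffer.size() - 3 : 0;
        if (data != nullptr && len > 0)
            m_buffer.append(reinterpret_cast<const char*>(data), len);

        const size_t pos = m_buffer.find("\r\n\r\n", searchFrom);
        if (pos != std::string::npos)
        {
            m_headerLen = pos + 4;  // include the terminator itself
            m_complete = true;
        }
        return m_complete;
    }

    bool complete() const { return m_complete; }

    // Returns the full header section, or an empty string if incomplete.
    std::string headers() const
    {
        return m_complete ? m_buffer.substr(0, m_headerLen) : std::string();
    }

private:
    std::string m_buffer;
    size_t m_headerLen = 0;
    bool m_complete = false;
};
```

Keeping state like this per connection is exactly what a packet-by-packet parser avoids, which is why message bodies are the harder case: they need the same buffering plus `Content-Length` or chunked-encoding handling.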

Anyway - feel free to start working on it! We don't have to open tickets for it, but we can if you prefer 😃

jpcofr commented 9 months ago

If you find a way to decrypt without an external dependency then I'm all for it!

Seems this would require a lengthy investigation... I may look into that later, after I have checked what is required.

Anyway - feel free to start working on it! We don't have to open tickets for it, but we can if you prefer

Yes, please open the ticket and I'll start pushing stuff soon.

seladb commented 9 months ago

@jpcofr actually we can use this ticket for HTTP2 😄 I just assigned it to you

jpcofr commented 8 months ago

@seladb I think you also need to add either the PRs (#1213, #1212) or the remote branches to this issue. I'd like to keep track of my contributions for my GitHub profile, if possible. 🙂

seladb commented 8 months ago

@seladb I think you also need to add either the PRs (#1213, #1212) or the remote branches to this issue. I'd like to keep track of my contributions for my GitHub profile, if possible. 🙂

I assigned both PRs to this issue, let me know if that's ok with you

jpcofr commented 8 months ago

@seladb I'm currently generating traces using a Python client/server script I wrote. How about you? How do you create your test traces?

seladb commented 8 months ago

@seladb I'm currently generating traces using a Python client/server script I wrote. How about you? How do you create your test traces?

What do you mean by traces?

jpcofr commented 8 months ago

What do you mean by traces?

Sample captures, like the ones on the Wireshark web page. I think I may need to generate samples to test corner cases. I meant that I know how to create simple traces with Python, but I may need an easy way to generate tweaked samples.

seladb commented 8 months ago

How do you create sample captures with Python? 🤔

Usually for new protocols, I try to find captures online to make sure the data is real

jpcofr commented 8 months ago

How do you create sample captures with Python? 🤔

Ok, I stretched it a bit... I wrote two ~50-line scripts (server/client) that barely work and just send some frames. I capture the traffic using Wireshark and then I call it "generating a sample". The problem, in the general case, is too wide IMO...

seladb commented 8 months ago

yeah, usually I just google it and try to find pcap files with the specific protocol. This is usually easier than trying to generate packets myself...

jpcofr commented 8 months ago

@seladb I found a traffic generator... it seems it may be useful... but I do not have time to test it... it is even on GitHub: Cisco TRex

seladb commented 8 months ago

Yes, I've heard about this traffic generator but never used it...

tigercosmos commented 2 months ago

Removing the assignment due to lack of activity.