JulianSchmid / etherparse

A rust library for parsing ethernet & ethernet using protocols.
Apache License 2.0

"Downwards" parsing (currently) makes it difficult to work with unpopular protocols #32

Open karpawich opened 2 years ago

karpawich commented 2 years ago

The problem

The etherparse family of from functions parses a given u8 slice "downward" by design. In practice, this means that whenever etherparse recognizes the format of a payload at any point in the parsing process, it parses that payload as well.

At first glance, this feature sounds great. One of the central tenets of the etherparse library is that it places a particular emphasis on the most popular packet-based protocols, and downwards parsing does just that. Suppose, for example, that you had a u8 slice containing a set of nested packets that all came from popular networking protocols, e.g. Ethernet -> IP -> TCP. With downward parsing, a Rust programmer could point a single from_ethernet function at this slice, and parse all three.

At the same time, the current implementation of downward parsing inhibits the use of etherparse for unpopular or custom protocols. Suppose, now, that you wanted to use etherparse to implement an IP router that receives IP packets on a set of interfaces and forwards them to their intended destinations. You should not have to care about the payload format. For each u8 slice that comes in on an interface, you would naturally reach for the from_ip function to parse it into a SlicedPacket. The problem is the payload pointer. I think that most Rust programmers in this situation would assume that payload always points to the payload of the IP packet. But it does not. By design, if etherparse recognizes the format of the IP packet payload as another packet (e.g. a TCP packet), the payload pointer instead points to the payload of that inner packet, because etherparse has gone ahead and eagerly parsed it for you.
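To make the difference concrete, here is a minimal sketch that uses only the standard library. It does not call etherparse, and every name in it (ipv4_payload, downward_payload, sample_ipv4_tcp) is hypothetical. It only illustrates which bytes a router would expect the payload to cover, and which bytes an eager "downward" parse would report instead.

```rust
// Hypothetical, std-only illustration. None of these functions are etherparse APIs.

/// Returns the IPv4 payload: every byte after the IPv4 header, left unparsed.
fn ipv4_payload(packet: &[u8]) -> Option<&[u8]> {
    let first = *packet.first()?;
    if first >> 4 != 4 {
        return None; // not an IPv4 packet
    }
    // The IHL field counts 32-bit words; a valid header is at least 20 bytes.
    let header_len = usize::from(first & 0x0f) * 4;
    if header_len < 20 {
        return None;
    }
    let total_len = usize::from(u16::from_be_bytes([*packet.get(2)?, *packet.get(3)?]));
    // `get` returns None if the range is reversed or runs past the slice.
    packet.get(header_len..total_len)
}

/// Mimics eager "downward" parsing: if the IPv4 protocol field says TCP,
/// the TCP header is skipped as well and only the TCP payload is returned.
fn downward_payload(packet: &[u8]) -> Option<&[u8]> {
    const TCP: u8 = 6;
    let ip_payload = ipv4_payload(packet)?;
    // Index 9 is safe to read: ipv4_payload succeeded, so the header is complete.
    if *packet.get(9)? == TCP {
        // The upper four bits of byte 12 hold the TCP data offset in 32-bit words.
        let data_offset = usize::from(*ip_payload.get(12)? >> 4) * 4;
        if data_offset < 20 {
            return None;
        }
        ip_payload.get(data_offset..)
    } else {
        Some(ip_payload)
    }
}

/// Builds a minimal IPv4 packet with a 20-byte TCP header followed by `data`.
/// Checksums are left at zero; this is test input, not a valid wire packet.
/// Assumes `data` is short enough for the total length to fit in a u16.
fn sample_ipv4_tcp(data: &[u8]) -> Vec<u8> {
    let total_len = (20 + 20 + data.len()) as u16;
    let mut p = vec![0u8; 40];
    p[0] = 0x45; // version 4, IHL 5
    p[2..4].copy_from_slice(&total_len.to_be_bytes());
    p[8] = 64; // TTL
    p[9] = 6; // protocol: TCP
    p[20 + 12] = 0x50; // TCP data offset: 5 words
    p.extend_from_slice(data);
    p
}
```

For a packet built with sample_ipv4_tcp(b"hi"), ipv4_payload covers the TCP header plus the two data bytes, which is what a router forwards, while downward_payload covers only the two data bytes.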

In summary:

  1. If you want to use etherparse to help you implement some part of the network layer -- in this specific case, an IP router -- you might have to violate the separation of concerns that is characteristic of the OSI model and dip into the transport layer.

  2. While the current implementation of downward parsing makes it easy to use etherparse for popular protocols, it does so at the cost of introducing what I would characterize as inconsistent and confusing behavior that makes it harder to use etherparse for unpopular protocols.

Possible solutions

As with #26, I am happy to do the work of implementing the solution to this problem. However, before I do, I want to discuss how I should go about solving it. Among others, I see at least two solutions:

  1. Do away with "downward" parsing.

  2. Split payload into separate pointers, one per layer, so that each pointer always refers to the payload of the same layer.

I am personally in favor of the first solution, especially because even the packets of popular protocols do not always follow the particular "downwards" nesting order (Ethernet -> IP -> TCP) that etherparse assumes they do. For example, it is perfectly acceptable (and perhaps even somewhat common) to nest an IP packet inside of a TCP packet (see Tunneling). Likewise, I think that it is perfectly reasonable to ask Rust programmers to parse each payload themselves.
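For comparison, the second option could look roughly like the sketch below. It is std-only and purely illustrative: the Payloads type and the parse_from_ip function are hypothetical names, not part of etherparse, and the parsing is reduced to the minimum needed to show one payload slice per layer.

```rust
// Hypothetical, std-only illustration. Nothing here is an etherparse API.

/// One payload slice per layer, so each field always means the same thing.
#[derive(Debug, PartialEq)]
struct Payloads<'a> {
    /// Always the bytes after the IPv4 header, whatever they contain.
    ip: &'a [u8],
    /// The bytes after the TCP header, present only if the IPv4 payload
    /// was recognized as TCP and its header could be read.
    transport: Option<&'a [u8]>,
}

fn parse_from_ip(packet: &[u8]) -> Option<Payloads<'_>> {
    const TCP: u8 = 6;
    let first = *packet.first()?;
    // The IHL field counts 32-bit words; a valid header is at least 20 bytes.
    let header_len = usize::from(first & 0x0f) * 4;
    if first >> 4 != 4 || header_len < 20 {
        return None;
    }
    let total_len = usize::from(u16::from_be_bytes([*packet.get(2)?, *packet.get(3)?]));
    // `get` returns None if the range is reversed or runs past the slice.
    let ip = packet.get(header_len..total_len)?;

    // Descending further is optional: an unknown or malformed inner packet
    // leaves `transport` empty but never changes what `ip` refers to.
    let transport = if packet[9] == TCP {
        ip.get(12)
            .map(|&b| usize::from(b >> 4) * 4) // TCP data offset in bytes
            .filter(|&offset| offset >= 20)
            .and_then(|offset| ip.get(offset..))
    } else {
        None
    };
    Some(Payloads { ip, transport })
}
```

With this shape, a router reads only the ip field and never needs to know whether the inner protocol was recognized, while code that works with TCP can still use transport.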

karpawich commented 2 years ago

What do you think @JulianSchmid ?