zeek / spicy

C++ parser generator for dissecting protocols & files.
https://docs.zeek.org/projects/spicy

Provide a mechanism to synchronize two parse streams #1328

Open rsmmr opened 1 year ago

rsmmr commented 1 year ago

The use case is two sides of a connection where the parsing results from one side determine how the other side should proceed (e.g., a client's STARTTLS needs to be acknowledged by the server before TLS actually kicks in). Currently one needs to manually buffer any further data until the information about how to proceed becomes available, which is both cumbersome and inefficient. The current thought is to introduce a semaphore-like mechanism that allows one side to wait for the other, yielding in the meantime. Details to be figured out.

rsmmr commented 1 year ago

I'm thinking of introducing the concept of a barrier to the Spicy language: a mechanism to block processing until multiple parsers have all arrived at a certain point. barrier would be a new type, and one could maintain an instance inside the current %context to synchronize the two sides of a connection. Quick straw man of the type:

Constructor:

barrier(n: uint64): Constructor creating a barrier synchronizing a given number of parties.

Methods:

barrier.wait(): Block until the expected number of parties have arrived at the barrier.
barrier.arrive(): Signal a party's arrival at the barrier.
barrier.arrive_and_wait(): First arrive(), then wait().
barrier.abort(): Signal failure to all waiters (current and future); this would trigger a BarrierBroken exception to all waiting parties.

When a barrier destructs, all currently still waiting parties would receive a BarrierTimeout exception.
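To make the proposed semantics concrete, here is a rough Python model of the straw man. It is only a sketch under assumptions: generators stand in for parsers that yield while blocked, the `run` scheduler and the `side` helper are hypothetical, and destruction (BarrierTimeout) is not modeled. None of this is Spicy's actual API.

```python
class BarrierBroken(Exception):
    """Raised to waiters once abort() has been called."""


class Barrier:
    """Toy model of the proposed barrier type (not Spicy's actual API)."""

    def __init__(self, n):
        self.expected = n      # number of parties to synchronize
        self.arrived = 0       # parties that have called arrive() so far
        self.broken = False    # set by abort()

    def arrive(self):
        self.arrived += 1

    def abort(self):
        self.broken = True

    def wait(self):
        # Generator standing in for a parser that yields while blocked.
        while True:
            if self.broken:
                raise BarrierBroken()
            if self.arrived >= self.expected:
                return
            yield

    def arrive_and_wait(self):
        self.arrive()
        yield from self.wait()


def side(name, barrier, log):
    # Hypothetical per-direction parser: do some work, sync, continue.
    log.append(f"{name}: before barrier")
    yield from barrier.arrive_and_wait()
    log.append(f"{name}: after barrier")


def run(tasks):
    # Minimal round-robin scheduler: resume each task until all finish.
    pending = list(tasks)
    while pending:
        for task in list(pending):
            try:
                next(task)
            except StopIteration:
                pending.remove(task)


log = []
b = Barrier(2)
run([side("client", b, log), side("server", b, log)])
print(log)
```

In this model the client arrives first and yields; the server arrives second, finds the barrier complete, and passes straight through; the client then continues on its next turn.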

In a %context type, this could be used like this:

type Context = struct {
  tls_handshake_done: barrier(2);
};

zeek-bot commented 1 year ago

This issue has been mentioned on Zeek. There might be relevant details there:

https://community.zeek.org/t/need-help-for-spicy-analyzer/7126/2

awelzel commented 4 months ago

The BinPac SSH analyzer in Zeek could leverage this feature, as it has stream parsing dependencies to 1) determine whether SSH v1 or v2 should be used based on the banners from both sides and 2) determine how to continue parsing after seeing KEX_INIT from both sides.

awelzel commented 4 months ago

An additional consideration for the barrier type is setups where traffic from just one side is visible (half-duplex traffic). This shouldn't cause indefinite buffering; instead, it should likely raise a parse error after some amount of time or buffering.
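
One way the half-duplex concern might be handled is a cap on how much data the waiting side may buffer. The Python sketch below is illustrative only: `ParseError`, `buffer_until_ready`, `ready()`, and the `max_buffered` limit are hypothetical names, and a real design might bound by time instead of bytes.

```python
class ParseError(Exception):
    """Hypothetical error: the peer never arrived within the buffering budget."""


class Barrier:
    """Stripped-down barrier model; only tracks arrivals."""

    def __init__(self, n):
        self.expected = n
        self.arrived = 0

    def arrive(self):
        self.arrived += 1

    def ready(self):
        return self.arrived >= self.expected


def buffer_until_ready(barrier, chunks, max_buffered):
    """Buffer incoming chunks while waiting; give up past max_buffered bytes."""
    buffered = bytearray()
    for chunk in chunks:
        if barrier.ready():
            break  # the other side arrived, so stop buffering
        buffered += chunk
        if len(buffered) > max_buffered:
            raise ParseError(f"gave up after buffering {len(buffered)} bytes")
    return bytes(buffered)


# Only the client side is visible, so the second party never arrives.
b = Barrier(2)
b.arrive()
try:
    buffer_until_ready(b, [b"x" * 10] * 5, max_buffered=32)
except ParseError as err:
    print("parse error:", err)
```

With one-sided traffic the waiting side fails once its budget is exceeded, so buffering stays bounded.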