zeek / spicy

C++ parser generator for dissecting protocols & files.
https://docs.zeek.org/projects/spicy
Other
249 stars 37 forks source link

Reduce generated C++ code for bytes parsing #1837

Closed rsmmr closed 1 month ago

rsmmr commented 2 months ago

This reduces the amount of C++ code we generate for some cases of common fields, in particular simple bytes fields and literals where we don't need to account for look-ahead parsing. This introduces new machinery for implementing such narrow optimizations, which isn't quite as easy as one would hope because we need to reliably identify the cases when we can apply them.

rsmmr commented 2 months ago

@sethhall Take a look at this, it should help with the amount of generated code with your bytes fields.

sethhall commented 2 months ago

Oh nice!!!! I'll definitely be trying this soon.

sethhall commented 2 months ago

Ok, I tried this briefly. I generated a parser with one unit and 300 bytes fields in it. It took ~35s to compile (just compile, no linking) with the spicy that zeek is currently using and it took ~12.5s with this branch. The code is clearly much shorter too. It looks nice!

sethhall commented 2 months ago

Hm, now what I can't explain is that the Krb5 analyzer is taking longer to compile with spicyc (when I compile all the way from spicy code to an hlto file). This makes no sense at all. I'm going to have to dig a bit more on that to figure out what might be contributing to it taking longer.

It's not massively longer, but it's definitely consistently longer.

rsmmr commented 2 months ago

It's not massively longer, but it's definitely consistently longer.

Does -R show any noticeable difference somewhere that's not JIT?

sethhall commented 2 months ago

Does -R show any noticeable difference somewhere that's not JIT?

Nope. Nothing that stood out to me. I dug a little more last night but still nothing stood out. I'll keep digging.

sethhall commented 2 months ago

Oh, one other thought. It could be interesting to get that selective on-heap branch and this one together. I tried merging them yesterday but they have merge conflicts.

sethhall commented 2 months ago

Is this branch just waiting on code review or is there more work to do on it? Looks like it's lingered enough already that it has a conflict with main.

rsmmr commented 1 month ago

Is this branch just waiting on code review or is there more work to do on it?

Mostly I just didn't get to wrapping it up yet, but we do also still need to understand any compile-time differences, I've been meaning to look into that.

sethhall commented 1 month ago

Mostly I just didn't get to wrapping it up yet, but we do also still need to understand any compile-time differences, I've been meaning to look into that.

Cool, no rush. I was just curious.

rsmmr commented 1 month ago

need to understand any compile-time differences

I've tried it now, and I don't see a slow-down; in fact it's slightly faster for me with the branch. (I used CCACHE_DISABLE=1, otherwise it could be confusing). So I'll go ahead and wrap this up.