Closed philrz closed 5 months ago
Thanks @nwt! I'll accept the approval despite the light coverage on the shaper part. @mattnibs has been looking that over with an eye toward possible language improvements that might make it more self-documenting. But the old shaper was so out-of-date and I'm pretty confident the new one offers better functionality and error handling, so I'm keen to get this merged so I can point users at it sooner and we can keep making improvements over time. I also know I'm on the hook to probably be the exclusive supporter of this stuff since you guys rightfully have other things on your minds besides Zeek stuff. 😉
tl;dr
This is a significant update to the Zeek integration docs, in particular bringing the shaper script current with Zeek v6.2.0. You can see the rendered version of them here.
Details
The changes in this PR are the long overdue "material changes" foretold in #4694. In addition to bringing the type definitions up to date with current GA Zeek release v6.2.0, I've made the shaper more compact and also started using Zed's error type to surface problems encountered during shaping. I think this all provides a solid working example of many Zed concepts coming together to solve a challenging problem. Now that build-zeek has me in the habit of keeping up with GA Zeek releases, I'm hopeful that I'll be able to keep this shaping doc current as the log types gradually change and avoid having to do grand overhauls like this going forward.
In addition to the changes to the Zeek shaper itself and how it's described, I've made some general improvements to the Zeek integration docs to fix typos, add links, and bring them more current with the evolving state of the other Zed docs. In some cases this actually involved removing some text since we've got better coverage of topics in their proper homes, e.g., we now have the detailed Shaping and Type Fusion doc whereas in the past the Zeek shaper doc effectively served as the most comprehensive doc about shaping. There's still a little redundancy in the Zeek shaper doc because I figured it was helpful to present the concepts in context like a "user guide" rather than sending the reader on a scavenger hunt through reference materials, though of course I still link off to all the relevant functions/operators/etc.
If reviewers would like to see the rendered docs rather than trying to pick through the diffs here, I've pushed a built copy of the docs site based on this branch to a personal staging site at https://6616d57a0260af2ee74d1a3e--spiffy-gnome-8f2834.netlify.app/docs/next/integrations/zeek.
How it was done
Here are some notes-to-self on how I came up with the changes here, as I expect they may come in handy the next time I do this.
While its output is no longer directly usable in Zed tooling, the print-types.zeek script from our (now archived) Zeek repo remains useful for assessing the default fields/types output by a particular Zeek release. To gather these for the old/new endpoints of this exercise I ran this on Zeek v4.1.1:
And on Zeek v6.2.0:
Then check for differences:
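As a rough sketch of this kind of comparison (not the commands actually used for this PR), the Python below reports log types and fields that were added, removed, or retyped between two releases. It assumes each types file can be loaded as a JSON object mapping a log type name to a list of `{"name": ..., "type": ...}` field descriptors; that layout, the helper name `diff_log_types`, and the sample data are all hypothetical and would need adapting to the real output of print-types.zeek.

```python
import json


def diff_log_types(old, new):
    """Compare two mappings of log type -> list of field descriptors.

    ASSUMPTION: each descriptor is a dict with "name" and "type" keys.
    The real print-types.zeek output may be laid out differently.
    """
    report = {
        "added_logs": sorted(set(new) - set(old)),
        "removed_logs": sorted(set(old) - set(new)),
        "changed_logs": {},
    }
    # Only log types present in both releases can have field-level changes.
    for log in sorted(set(old) & set(new)):
        old_fields = {f["name"]: f["type"] for f in old[log]}
        new_fields = {f["name"]: f["type"] for f in new[log]}
        shared = set(old_fields) & set(new_fields)
        changes = {
            "added": sorted(set(new_fields) - set(old_fields)),
            "removed": sorted(set(old_fields) - set(new_fields)),
            "retyped": sorted(n for n in shared if old_fields[n] != new_fields[n]),
        }
        if any(changes.values()):
            report["changed_logs"][log] = changes
    return report


# Made-up sample data standing in for the two loaded types files.
old_types = {
    "conn": [
        {"name": "ts", "type": "time"},
        {"name": "uid", "type": "string"},
    ],
}
new_types = {
    "conn": [
        {"name": "ts", "type": "time"},
        {"name": "uid", "type": "string"},
        {"name": "extra", "type": "string"},  # hypothetical new field
    ],
    "ldap": [{"name": "ts", "type": "time"}],  # entirely new log type
}

report = diff_log_types(old_types, new_types)
print(json.dumps(report, indent=2))
```

With real files, `old_types` and `new_types` would come from `json.load()` on the two dumps. A structured comparison like this makes an entirely new log type stand out from a handful of changed fields in an existing one, which a plain line-based diff can blur together.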
If an entirely new log type was spotted in the diff (e.g., ldap in this case) or an existing log type was overhauled significantly, the lines that define the descriptor array for that log type were copied from types-6.2.0.json to a separate file, then run through this pipeline in a script, cleanup-type.sh:
For example:
Because of the known limitation of print-types.zeek described in https://github.com/brimdata/zeek/issues/15, I also manually eyeballed the type definition for Zeek's openflow and confirmed nothing has changed since the last shaper update.