brimdata / zeek

Zeek is a powerful network analysis framework that is much different from the typical IDS you may know.
https://www.zeek.org

print-types.zeek: handle nesting more than two levels deep #15

Closed philrz closed 8 months ago

philrz commented 4 years ago

This is a known limitation that is documented in print-types.zeek, but hadn't made its way over into this bug tracking system. The short summary is that the print-types script doesn't handle more than two levels of nesting. I had initially run into this with the openflow log, and here @philrz notes something similar with the smb_cmd log.

Repro is with the print-types.zeek script in Brim's fork of Zeek at commit e35de70 and zq commit 0397e42.

As of Zeek release v3.1.2, the smb_cmd.log is not generated by default. However, it can be brought to life by adding this line to local.zeek on a "stock" config:

@load policy/protocols/smb/log-cmds

Now we generate a typing config (attached):

# ZEEK_ALLOW_INIT_ERRORS=1 /usr/local/zeek-3.1.2/bin/zeek print-types.zeek /usr/local/zeek-3.1.2/share/zeek/site/local.zeek | tail +2 | jq | python -m json.tool > ~/work/zq/zeek/types-with-smb-cmd.txt

If I attempt to use it with zq commit 0397e42, it is rejected.

# zq -j ~/work/zq/zeek/types-with-smb-cmd.txt *
syntax error parsing type string

@henridf and I discussed this some time ago and noted that the cause of the error seems to lie within the referenced_file portion, because the config is accepted once I remove it:

# diff types-with-smb-cmd.txt types-with-smb-cmd-fixed.txt
2554,2598d2553
<                 "name": "referenced_file",
<                 "type": [
<                     {
<                         "name": "ts",
<                         "type": "time"
<                     },
<                     {
<                         "name": "uid",
<                         "type": "bstring"
<                     },
<                     {
<                         "name": "id",
<                         "type": "record conn_id"
<                     },
<                     {
<                         "name": "fuid",
<                         "type": "bstring"
<                     },
<                     {
<                         "name": "action",
<                         "type": "zenum"
<                     },
<                     {
<                         "name": "path",
<                         "type": "bstring"
<                     },
<                     {
<                         "name": "name",
<                         "type": "bstring"
<                     },
<                     {
<                         "name": "size",
<                         "type": "uint64"
<                     },
<                     {
<                         "name": "prev_name",
<                         "type": "bstring"
<                     },
<                     {
<                         "name": "times",
<                         "type": "record SMB::MACTimes"
<                     }
<                 ]
<             },
<             {

# zq -j ~/work/zq/zeek/types-with-smb-cmd-fixed.txt *
#0:record[_path:string,ts:time,peer:bstring,mem:uint64,pkts_proc:uint64,bytes_recv:uint64,pkts_dropped:uint64,pkts_link:uint64,pkt_lag:duration,events_proc:uint64,events_queued:uint64,active_tcp_conns:uint64,active_udp_conns:uint64,active_icmp_conns:uint64,tcp_conns:uint64,udp_conns:uint64,icmp_conns:uint64,timers:uint64,active_timers:uint64,files:uint64,active_files:uint64,dns_requests:uint64,active_dns_requests:uint64,reassem_tcp_size:uint64,reassem_file_size:uint64,reassem_frag_size:uint64,reassem_unknown_size:uint64,_write_ts:time]
0:[stats;1425565512.943615;bro;87;58;20905;-;-;-;410;13;0;0;1;0;0;1;36;31;0;0;0;0;0;0;0;0;1425565512.943615;]
...

The comments in print-types.zeek imply there are limits to how it deals with recursion, so perhaps nested records such as id inside referenced_file are a source of trouble.
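To find other fields that might trigger the same rejection, a small script can list every field whose type is not a plain primitive. The following is only a sketch: it assumes the name/type layout visible in the diff above, where a nested record appears either as an inline list of fields or as a "record ..." string. The function name find_nested_fields is made up for illustration, and since the top-level layout of the file is not shown in this issue, the walker simply descends through whatever JSON it is given.

```python
import json


def find_nested_fields(node, path=()):
    """Yield (dotted_path, description) for fields that hold a nested record.

    Assumes fields look like {"name": ..., "type": ...}, where "type" is
    either a string or a list of further field objects (as in the diff).
    """
    if isinstance(node, list):
        for item in node:
            yield from find_nested_fields(item, path)
    elif isinstance(node, dict):
        name, ftype = node.get("name"), node.get("type")
        if isinstance(name, str) and "type" in node:
            here = path + (name,)
            if isinstance(ftype, list):
                # Inline nested record, e.g. referenced_file
                yield ".".join(here), "inline record"
                yield from find_nested_fields(ftype, here)
            elif isinstance(ftype, str) and ftype.startswith("record "):
                # Reference to a named record, e.g. "record conn_id"
                yield ".".join(here), ftype
        else:
            # Not a field object: keep descending through its values
            for value in node.values():
                yield from find_nested_fields(value, path)


# Abbreviated copy of the referenced_file fragment from the diff above
sample = json.loads("""
{
  "name": "referenced_file",
  "type": [
    {"name": "ts", "type": "time"},
    {"name": "id", "type": "record conn_id"},
    {"name": "times", "type": "record SMB::MACTimes"}
  ]
}
""")

for dotted_path, description in find_nested_fields(sample):
    print(dotted_path, "->", description)
```

To check a full config, load it with json.load() and pass the result to the same function. Fields reported under an inline record (here referenced_file.id and referenced_file.times) are the deeper-nesting candidates worth inspecting first.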

Since this particular Zeek log isn't even enabled by default, this probably needn't be a high priority. However, the same symptom might be lurking in other Zeek logs that aren't on by default, so we may want to give it some consideration before too long. Use of zq with Zeek JSON is likely to start soon, and we expect to guide users through running print-types.zeek as needed to generate their own custom schemas.

philrz commented 8 months ago

The kind of shaping config output from this Zeek script is no longer supported in Zed, so it seems unlikely we'd ever go back and add this enhancement. While I've found this script still sometimes useful for getting a quick summary of additional default log types added in new Zeek versions, I've grown accustomed to performing the minimal, manual surgery to graft on deeply nested fields in modern Zed type definitions. Therefore I'm going to close out this issue.