brimdata / super

A novel data lake based on super-structured data
https://zed.brimdata.io/
BSD 3-Clause "New" or "Revised" License

Change standard file extensions for ZNG files #544

Closed alfred-landrum closed 4 years ago

alfred-landrum commented 4 years ago

The ZNG spec defines binary and text formats for files. We currently use 'bzng' as the extension for binary and 'zng' as the extension for text. The binary format is more widely used; the text format is currently used for testing, debugging, or demos.

We discussed this internally and thought it would make sense to change the 'standard' extensions now: we'll use 'zng' as the extension for the binary format, and 'tzng' for the text format. We can continue to interpret an extension of 'bzng' as the binary format.
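To make the proposed mapping concrete, here is a minimal shell sketch of how a file name could be mapped to a format under the new convention. The function name `zng_format_for_file` and the `auto` fallback for unrecognized extensions are assumptions made for illustration; this is not how zq itself is implemented.

```shell
# Hypothetical helper: print the format implied by a file's extension.
# Returns "zng" (binary), "tzng" (text), or "auto" when the extension
# is not recognized (an assumed fallback, not taken from zq).
zng_format_for_file() {
    case "$1" in
        *.tzng) echo "tzng" ;;  # text format, new standard extension
        *.zng)  echo "zng"  ;;  # binary format, new standard extension
        *.bzng) echo "zng"  ;;  # legacy extension, still read as binary
        *)      echo "auto" ;;  # unknown extension: defer to detection
    esac
}
```

With this sketch, `zng_format_for_file all.bzng` and `zng_format_for_file all.zng` both print `zng`, while `zng_format_for_file sample.tzng` prints `tzng`.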

philrz commented 4 years ago

Also implies updating the ZNG spec.

philrz commented 4 years ago

Verified in zq commit 407b5f6.

This was changed in too many places to realistically verify all of them, but here are the heavy hitters. In the zq help output, the references to bzng are gone, and the default text-based ZNG output is now known as tzng.

# zq 2>&1 | grep -i zng
    -b limit for number of records in each ZNG stream(0 for no limit) (default "0")
    -f format for output data [zng,ndjson,table,text,types,zeek,zjson,tzng] (default "tzng")
    -i format of input data [auto,zng,ndjson,zeek,zjson,tzng] (default "auto")
    Supported input formats include binary and text zng, NDJSON, and
    The output format is text zng by default, but can be overridden with

# echo '{"foo": "bar"}' | zq -
#0:record[foo:string]
0:[bar;]

# echo '{"foo": "bar"}' | zq -f zng -
?foo   bar?

# echo '{"foo": "bar"}' | zq -f bzng -
unknown output format: bzng

Also, after loading a pcap with Brim commit 870b60d, we see that the sorted event data is now stored in a file named all.zng instead of all.bzng, reflecting the new name for the binary format.

$ ls -l
total 328912
-rw-r--r--  1 phil  staff  152977471 Apr 24 15:53 all.zng
-rw-r--r--  1 phil  staff         92 Apr 24 15:50 info.json
-rw-r--r--  1 phil  staff     622381 Apr 24 15:50 packets.idx.json

philrz commented 4 years ago

Update: With the merge of https://github.com/brimsec/zq/pull/656 it is confirmed that the only reference to "bzng" in the zq code is in the one place where we check for files named all.bzng that might have been left over in users' older Brim data directories.
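For anyone who wants to see whether an older Brim data directory still contains such leftover files, a minimal sketch follows. The function name `find_legacy_bzng` is hypothetical, and the directory to search is whatever path you pass in; this is an illustration, not the check that zq performs.

```shell
# Hypothetical helper: list any leftover all.bzng files beneath the
# directory given as the first argument.
find_legacy_bzng() {
    find "$1" -type f -name 'all.bzng'
}
```

Calling `find_legacy_bzng /path/to/data` prints one path per leftover file and prints nothing if none are found.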
