Closed philrz closed 4 years ago
The failure occurs because directed-pairs.zng
is no longer sorted before its used to create an index; that can be addressed immediately by altering how it is created:
zar zq -o directed-pairs.zng "sum(count) as count by from,to | sort from,to" forward.zng reverse.zng
When we've given this demo, we've called out that we'd like to simplify these steps. One possibility is that we view the directed-pairs.zng
as temporary data, that doesn't need to persist. So we'd modify custom index creation to take one or more files, operate on them, ensuring their desired key fields, etc.
Also, some context here is that pre-#914, groupby outputs were sorted in "undefined but deterministic" order, where "undefined" meant "compare the byte-level representations of the values". In the case of IP addrs, the byte-level comparisons actually result in the numeric order that one would expect for IPs (unlike for some other zng types). So long story short, the fact that this worked previously was a happy accident.
I put https://github.com/brimsec/zq/pull/954 so that anyone following the readme would get the correct behavior.
I've verified that as of zq
commit 8cdcf4b
, a run-through the entire zar
README demo script is working again.
Thanks @alfred-landrum!
I was running a
zar
demo yesterday and it went off the rails when running through the simple graph queries step. I've done a binary search and determined that the regression is correlated with commit617047f
that came with @henridf's #914 ("Groupby: remove deterministic output order"). Waszar
somehow dependent on the previous "deterministic but undefined" order?Assuming the setup described in the
zar
README, the followingrepro.sh
script has the commands from the demo that make it easy to repro:As of commit
8dab96b
that came right before #914 was merged, the output lines up with what's in the README:Then at commit
617047f
, the output is short, different, and tellingly... not deterministic: