alfred-landrum closed 4 years ago
The zdx file would become a concatenation of streams:
Creating the zdx would work like this:
To use this to find keys, a zdx reader would read the zdx file, read the super-block to follow the offsets to the btree toc, then execute a btree search.
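The lookup path described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the zdx format: the super-block layout (a stream count followed by absolute stream start offsets), the fixed-size `(key, value)` records, the top-down stream order, and the names `build` and `lookup` are all hypothetical. The real zdx streams hold zng records.

```python
import struct

# Hypothetical fixed-size record: 8-byte key, 8-byte value (both big-endian).
REC = struct.Struct(">QQ")

def build(base, fanout=2):
    """Return single-file bytes: super-block, index levels top-down, then base."""
    levels = [sorted(base)]
    # Each index level keeps every fanout-th key of the level below, paired
    # with that record's byte offset relative to the start of the level below.
    while len(levels[0]) > fanout:
        below = levels[0]
        levels.insert(0, [(below[i][0], i * REC.size)
                          for i in range(0, len(below), fanout)])
    streams = [b"".join(REC.pack(k, v) for k, v in lvl) for lvl in levels]
    # Assumed super-block: stream count, then the absolute start of each stream.
    pos = 4 + 8 * len(streams)
    starts = []
    for s in streams:
        starts.append(pos)
        pos += len(s)
    superblock = struct.pack(">I", len(streams))
    superblock += b"".join(struct.pack(">Q", s) for s in starts)
    return superblock + b"".join(streams)

def lookup(data, key):
    """Read the super-block, then descend one stream per btree level."""
    (n,) = struct.unpack_from(">I", data, 0)
    starts = [struct.unpack_from(">Q", data, 4 + 8 * i)[0] for i in range(n)]
    ends = starts[1:] + [len(data)]
    off = 0  # offset relative to the start of the current stream
    for level in range(n):
        pos, end = starts[level] + off, ends[level]
        match = None
        # Linear scan for the last record whose key is <= the target key.
        while pos < end:
            k, v = REC.unpack_from(data, pos)
            if k > key:
                break
            match = (k, v)
            pos += REC.size
        if match is None:
            return None  # the key sorts before everything in the index
        if level == n - 1:
            return match[1] if match[0] == key else None
        off = match[1]  # interpreted relative to the next stream
    return None

data = build([(10, 100), (20, 200), (30, 300), (40, 400), (50, 500)])
print(lookup(data, 30), lookup(data, 35))
```

In this sketch the index levels never store absolute file positions, so the streams can be written independently and simply concatenated once the super-block is known.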
Verified in zq commit 04f2c11.
Here is how things looked at zq commit 90dddfc, right before this change. After creating the micro-indexes as shown in the zar README, note the presence of the .1 files.
$ zq zng/*.gz | zar import -s 25MB -
$ zar index :ip
$ zar index uri
$ zar index -q -o custom -k id.orig_h -z "count() by _path, id.orig_h | sort id.orig_h"
$ tree -s logs/
logs/
├── [ 320] 20180324
│ ├── [ 4425007] 1521911772.980384.zng
│ ├── [ 192] 1521911772.980384.zng.zar
│ │ ├── [ 6303] custom.zng
│ │ ├── [ 119] zdx-field-uri.1.zng
│ │ ├── [ 96953] zdx-field-uri.zng
│ │ └── [ 29732] zdx-type-ip.zng
│ ├── [ 25001925] 1521912075.114273.zng
│ ├── [ 192] 1521912075.114273.zng.zar
│ │ ├── [ 8424] custom.zng
│ │ ├── [ 59] zdx-field-uri.1.zng
│ │ ├── [ 124555] zdx-field-uri.zng
│ │ └── [ 15699] zdx-type-ip.zng
│ ├── [ 25007413] 1521912507.399929.zng
│ ├── [ 224] 1521912507.399929.zng.zar
│ │ ├── [ 12323] custom.zng
│ │ ├── [ 143] zdx-field-uri.1.zng
│ │ ├── [ 172595] zdx-field-uri.zng
│ │ ├── [ 41] zdx-type-ip.1.zng
│ │ └── [ 80764] zdx-type-ip.zng
│ ├── [ 25005195] 1521912990.158766.zng
│ └── [ 160] 1521912990.158766.zng.zar
│   ├── [ 7538] custom.zng
│   ├── [ 62757] zdx-field-uri.zng
│   └── [ 28392] zdx-type-ip.zng
└── [ 778] zar.json
5 directories, 21 files
After repeating the same steps at zq commit 04f2c11, which has this enhancement, the .1 files are gone.
$ tree -s logs/
logs/
├── [ 320] 20180324
│ ├── [ 4425007] 1521911772.980384.zng
│ ├── [ 160] 1521911772.980384.zng.zar
│ │ ├── [ 6347] custom
│ │ ├── [ 97012] zdx-field-uri.zng
│ │ └── [ 29766] zdx-type-ip.zng
│ ├── [ 25001925] 1521912075.114273.zng
│ ├── [ 160] 1521912075.114273.zng.zar
│ │ ├── [ 8468] custom
│ │ ├── [ 124614] zdx-field-uri.zng
│ │ └── [ 15733] zdx-type-ip.zng
│ ├── [ 25007413] 1521912507.399929.zng
│ ├── [ 160] 1521912507.399929.zng.zar
│ │ ├── [ 12367] custom
│ │ ├── [ 172703] zdx-field-uri.zng
│ │ └── [ 80825] zdx-type-ip.zng
│ ├── [ 25005195] 1521912990.158766.zng
│ └── [ 160] 1521912990.158766.zng.zar
│   ├── [ 7582] custom
│   ├── [ 62792] zdx-field-uri.zng
│   └── [ 28426] zdx-type-ip.zng
└── [ 794] zar.json
5 directories, 17 files
Thanks @mccanne!
Filing from a live discussion in https://github.com/brimsec/zq/pull/600: we could implement the zdx bundle as a single file, with potentially very little work, by concatenating the btree files with the base file. In this case, each stored offset would be interpreted as an offset into the next stream of the overall zdx file.
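To make the offset interpretation concrete, here is a small sketch of the arithmetic. The helper names `concat_streams` and `absolute_offset` are made up for illustration, and the order in which the files are concatenated is an assumption. The point is that the stored offsets are left untouched and only the start of the following stream is added when resolving them.

```python
def concat_streams(streams):
    """Concatenate per-level byte strings; return (blob, start of each stream)."""
    starts, pos = [], 0
    for s in streams:
        starts.append(pos)
        pos += len(s)
    return b"".join(streams), starts

def absolute_offset(starts, level, stored_offset):
    """An offset stored in stream `level` points into stream `level + 1`."""
    return starts[level + 1] + stored_offset

# Three placeholder streams standing in for two btree files and a base file.
blob, starts = concat_streams([b"aa", b"bbbb", b"cccccc"])
print(starts, absolute_offset(starts, 0, 3))
```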