brimdata / super

A novel data lake based on super-structured data
https://zed.brimdata.io/
BSD 3-Clause "New" or "Revised" License

uid correlation query failing: slice bounds out of range #724

Closed (philrz closed 4 years ago)

philrz commented 4 years ago

Found on Brim tagged v0.9.0 talking to zqd tagged 0.13.0. It doesn't seem to trigger with all pcaps, but I can reliably repro with https://archive.wrccdc.org/pcaps/2018/wrccdc.2018-03-23.010014000000000.pcap.gz. I first saw this when smoke testing on Linux, but just reproduced on macOS as well.

Once uncompressed and imported into Brim, the Wireshark button does not activate for any events I click on. When I sniff in Wireshark, I see an error coming back from zqd which I can repro with curl:

$ curl "http://localhost:9867/search?format=zjson" -H "Content-Type: text/plain;charset=UTF-8" -d '{"proc":{"op":"SequentialProc","procs":[{"op":"FilterProc","filter":{"op":"LogicalOr","left":{"op":"LogicalOr","left":{"op":"LogicalOr","left":{"op":"CompareField","comparator":"=","field":{"op":"FieldRead","field":"uid"},"value":{"op":"Literal","type":"string","value":"CeAR584pP6sHvgnrLe"}},"right":{"op":"CompareField","comparator":"in","field":{"op":"FieldRead","field":"conn_uids"},"value":{"op":"Literal","type":"string","value":"CeAR584pP6sHvgnrLe"}}},"right":{"op":"CompareField","comparator":"in","field":{"op":"FieldRead","field":"uids"},"value":{"op":"Literal","type":"string","value":"CeAR584pP6sHvgnrLe"}}},"right":{"op":"CompareField","comparator":"=","field":{"op":"FieldCall","fn":"RecordFieldRead","field":{"op":"FieldRead","field":"referenced_file"},"param":"uid"},"value":{"op":"Literal","type":"string","value":"CeAR584pP6sHvgnrLe"}}}},{"op":"HeadProc","count":100}]},"space":"wrccdc.2018-03-23.010014000000000.pcap.brim","dir":-1,"span":{"ts":{"sec":1521835102,"ns":636000000},"dur":{"sec":111,"ns":696000000}}}'
{"type":"TaskStart","task_id":0}

{"type":"TaskEnd","task_id":0,"error":{"type":"INTERNAL","kind":"","error":"panic: runtime error: slice bounds out of range [44:2]"}}

alfred-landrum commented 4 years ago

The panic that's occurring looks like:

runtime/debug.Stack(0xc000530928, 0x16733e0, 0xc000039e80)
    /usr/local/Cellar/go/1.13.6/libexec/src/runtime/debug/stack.go:24 +0x9d
github.com/brimsec/zq/driver.(*Mux).safeGet.func1(0xc000530f70)
    /Users/alfred/work/zq/driver/mux.go:48 +0x54
panic(0x16733e0, 0xc000039e80)
    /usr/local/Cellar/go/1.13.6/libexec/src/runtime/panic.go:679 +0x1b2
github.com/brimsec/zq/zcode.(*Iter).Next(0xc000530ac8, 0xc00000f700, 0x17b43e0, 0x2169ea0, 0xc00000f700, 0xc0004acd04, 0x2)
    /Users/alfred/work/zq/zcode/iter.go:31 +0x22c
github.com/brimsec/zq/filter.Contains.func1(0x17b4820, 0xc00000f700, 0xc0004acd04, 0x2, 0x1f2fc, 0x1f2fc)
    /Users/alfred/work/zq/filter/compare.go:442 +0xf7
github.com/brimsec/zq/filter.combine.func1(0xc000304f60, 0x14baf00)
    /Users/alfred/work/zq/filter/filter.go:37 +0x89
github.com/brimsec/zq/filter.LogicalOr.func1(0xc000304f60, 0xc000530c18)
    /Users/alfred/work/zq/filter/filter.go:23 +0x65
github.com/brimsec/zq/filter.LogicalOr.func1(0xc000304f60, 0x0)
    /Users/alfred/work/zq/filter/filter.go:23 +0x38
github.com/brimsec/zq/filter.LogicalOr.func1(0xc000304f60, 0xc000304f60)
    /Users/alfred/work/zq/filter/filter.go:23 +0x38
github.com/brimsec/zq/scanner.(*Scanner).Read(0xc000108540, 0xc0001e2300, 0x0, 0x0)
    /Users/alfred/work/zq/scanner/scanner.go:72 +0x1c7
github.com/brimsec/zq/zbuf.ReadBatch(0x17a4c80, 0xc000108540, 0x64, 0x0, 0x0, 0x0, 0x0)
    /Users/alfred/work/zq/zbuf/batch.go:28 +0xcc
github.com/brimsec/zq/scanner.(*Scanner).Pull(0xc000108540, 0x0, 0x0, 0x0, 0x0)
    /Users/alfred/work/zq/scanner/scanner.go:47 +0xb9
github.com/brimsec/zq/proc.(*Base).Get(0xc000128280, 0x0, 0x0, 0x0, 0x0)
    /Users/alfred/work/zq/proc/proc.go:76 +0x38
github.com/brimsec/zq/proc.(*Head).Pull(0xc000128280, 0x0, 0x0, 0x0, 0x0)
    /Users/alfred/work/zq/proc/head.go:23 +0x69
github.com/brimsec/zq/proc.(*Base).Get(0xc000128300, 0x0, 0x0, 0x0, 0x0)
    /Users/alfred/work/zq/proc/proc.go:76 +0x38
github.com/brimsec/zq/driver.(*Mux).safeGet(0xc000128300, 0x0, 0x0, 0x0, 0x0)
    /Users/alfred/work/zq/driver/mux.go:52 +0x74
github.com/brimsec/zq/driver.(*Mux).run(0xc000128300)
    /Users/alfred/work/zq/driver/mux.go:65 +0x32
created by github.com/brimsec/zq/driver.(*MuxOutput).Pull.func1
    /Users/alfred/work/zq/driver/mux.go:104 +0x5d
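The top frames of the trace (`debug.Stack` called from a deferred function inside `safeGet`) suggest that the panic is recovered and turned into the `INTERNAL` error seen in the curl output. Below is a minimal sketch of that general Go pattern. It is not the actual `driver/mux.go` code, and `safeCall` and `sliceFrom` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// safeCall (hypothetical) runs fn and converts a panic into an ordinary
// error, capturing the stack trace at the point of recovery.
func safeCall(fn func() error) (stack []byte, err error) {
	defer func() {
		if r := recover(); r != nil {
			stack = debug.Stack()
			err = fmt.Errorf("panic: %v", r)
		}
	}()
	return nil, fn()
}

// sliceFrom (hypothetical) slices buf from offset n without checking that
// n is within bounds, so it panics when n > len(buf).
func sliceFrom(buf []byte, n int) error {
	_ = buf[n:]
	return nil
}

func main() {
	// A 2-byte buffer sliced at offset 44 triggers a runtime panic,
	// which safeCall reports as an error instead of crashing.
	stack, err := safeCall(func() error { return sliceFrom(make([]byte, 2), 44) })
	fmt.Println("error:", err)
	fmt.Printf("captured %d bytes of stack trace\n", len(stack))
}
```

Recovering at this boundary keeps the server alive and lets the client see the failure, but it only reports the symptom. The out-of-range slice itself still has to be fixed at its source.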

alfred-landrum commented 4 years ago

I ingested the 500MB wrccdc.2018-03-23.010014000000000.pcap file via Brim v0.9.0 and verified that I was seeing the Wireshark button problem and the errors reported above. Then I used zq v0.13.0 to execute a query from the command line:

$ zq -t _path=conn ./all.zng
...
#2:record[_path:string,ts:time,fuid:bstring,tx_hosts:set[ip],rx_hosts:set[ip],conn_uids:set[bstring],source:bstring,depth:uint64,analyzers:set[bstring],mime_type:bstring,filename:bstring,duration:duration,local_orig:bool,is_orig:bool,seen_bytes:uint64,total_bytes:uint64,missing_bytes:uint64,overflow_bytes:uint64,timedout:bool,parent_fuid:bstring,md5:bstring,sha1:bstring,sha256:bstring,extracted:bstring,extracted_cutoff:bool,extracted_size:uint64]
2:[conn;1521835200.430607;CprxT64yLXfNg1neii;[10.237.102.3;<ZNG-ERR type ip [%!s(PANIC=String method: runtime error: slice bounds out of range [24:2])]: failure trying to decode IP address that is not 4 or 16 bytes long>;10.47.1.59;<ZNG-ERR type ip [%!s(PANIC=String method: zcode encoding has bad format: bad uvarint: 0)]: failure trying to decode IP address that is not 4 or 16 bytes long>;]panic: runtime error: slice bounds out of range [:58] with capacity 29

goroutine 1 [running]:
github.com/brimsec/zq/zcode.(*Iter).Next(0xc0005397c0, 0x0, 0x0, 0x0, 0x1, 0xc00032f0a8, 0x0)
    /Users/alfred/work/zq/zcode/iter.go:30 +0x242
github.com/brimsec/zq/zng.(*TypeSet).StringOf(0xc00000fbe0, 0xc0002e3de3, 0x3, 0x1d, 0x1, 0xc0002e3d00, 0x0, 0x1e)
    /Users/alfred/work/zq/zng/set.go:62 +0x186
github.com/brimsec/zq/zng.Value.Format(...)
    /Users/alfred/work/zq/zng/value.go:77
github.com/brimsec/zq/zio/tzngio.(*Writer).writeValue(0xc00000f9a0, 0x1559660, 0xc00000fbe0, 0xc0002e3de3, 0x3, 0x1d, 0x0, 0x0)
    /Users/alfred/work/zq/zio/tzngio/writer.go:154 +0x1e4
github.com/brimsec/zq/zio/tzngio.(*Writer).writeContainer(0xc00000f9a0, 0x1559620, 0xc00041aa80, 0xc0002e3db0, 0x50, 0x50, 0x1, 0x2)
    /Users/alfred/work/zq/zio/tzngio/writer.go:141 +0x3a3
github.com/brimsec/zq/zio/tzngio.(*Writer).Write(0xc00000f9a0, 0xc000450de0, 0x0, 0x0)
    /Users/alfred/work/zq/zio/tzngio/writer.go:55 +0x19b
github.com/brimsec/zq/driver.(*CLI).Write(0xc0001c7b00, 0x0, 0x155b3c0, 0xc000531bf0, 0x0, 0x0)
    /Users/alfred/work/zq/driver/driver.go:75 +0xcd
github.com/brimsec/zq/driver.Run(0xc0000d0910, 0x1557be0, 0xc0001c7b00, 0x0, 0xc00000f9c0, 0xc0000d08c0)
    /Users/alfred/work/zq/driver/driver.go:46 +0x248
main.(*Command).Run(0xc0000f4370, 0xc0000200e0, 0x2, 0x2, 0x0, 0x0)
    /Users/alfred/work/zq/cmd/zq/zq.go:248 +0x811
github.com/mccanne/charm.(*instance).run(0xc00000eaa0, 0xc0000200d0, 0x3, 0x3, 0x0, 0x0)
    /Users/alfred/src/go/pkg/mod/github.com/mccanne/charm@v0.0.3-0.20191224190439-b05e1b7b1be3/instance.go:53 +0x28c
github.com/mccanne/charm.(*Spec).ExecRoot(0x1878480, 0xc0000200d0, 0x3, 0x3, 0xc0000e3f50, 0x100741f, 0xc000092058, 0x0)
    /Users/alfred/src/go/pkg/mod/github.com/mccanne/charm@v0.0.3-0.20191224190439-b05e1b7b1be3/charm.go:77 +0x96
main.main()
    /Users/alfred/work/zq/cmd/zq/main.go:9 +0x78

I then installed zq v0.12.0 and saw the same error as above.

So I think the error is not in executing the query from the frontend, but in the file itself: the stored zng data appears to be malformed. We saw a similar problem, a bad zng file being written, during the development of the external sort work: https://github.com/brimsec/zq/pull/527
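To illustrate why a malformed file shows up as "slice bounds out of range" rather than a clean error, here is a small sketch of iterating over uvarint length-prefixed values with an explicit bounds check. This is a simplified format assumed for illustration, not the actual zcode encoding or the fix that was applied in zq. `next` and `errBadFormat` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errBadFormat = errors.New("bad length-prefixed encoding")

// next (hypothetical) decodes one value from buf, assuming each value is
// stored as a uvarint length followed by that many bytes. It returns the
// value and the remaining bytes, or an error if the prefix is corrupt.
func next(buf []byte) (val, rest []byte, err error) {
	length, n := binary.Uvarint(buf)
	if n <= 0 {
		return nil, nil, errBadFormat
	}
	body := buf[n:]
	// Without this check, body[:length] would panic on corrupt input.
	if length > uint64(len(body)) {
		return nil, nil, fmt.Errorf("%w: length %d exceeds %d remaining bytes",
			errBadFormat, length, len(body))
	}
	return body[:length], body[length:], nil
}

func main() {
	good := []byte{3, 'a', 'b', 'c', 1, 'z'}
	for buf := good; len(buf) > 0; {
		val, rest, err := next(buf)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%q\n", val)
		buf = rest
	}

	// A length prefix larger than the remaining data yields an error.
	bad := []byte{58, 'a', 'b'}
	if _, _, err := next(bad); err != nil {
		fmt.Println("error:", err)
	}
}
```

A check like this would only change how the corruption is reported. If the file was written incorrectly, as suspected here, the writer is still where the real fix belongs.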

alfred-landrum commented 4 years ago

I increased the maximum memory usage for sort such that no spills occurred with my test pcap, and still saw the same errors as above, so this doesn't appear to be external sort related.

philrz commented 4 years ago

Verified in Brim commit 61fe048 talking to zqd commit f62f12f. Per the attached video, when I now import the wrccdc pcap, I can click on events and the Wireshark button activates, indicating the backend query completes without error.

Verify.mp4.zip

Thanks @mccanne!