brimdata / super

A novel data lake based on super-structured data
https://zed.brimdata.io/
BSD 3-Clause "New" or "Revised" License

add vanilla csv to search endpoint #1276

Closed mccanne closed 4 years ago

mccanne commented 4 years ago

Add a csv writer path to the search endpoint that is used when the format in the API request is "csv".

The response mime type should be text/csv.

Given the clean design of zqd/search, this task should be trivial: follow the pattern of the zng responder, using a csvio.Writer and dropping control messages.

philrz commented 4 years ago

Verified in zqd commit 2608a46.

If a user wants to see CSV output, they now have two ways to functionally arrive at that:

$ zapi -s foo get -f csv > /tmp/f.csv
$ zapi -s foo get -e csv > /tmp/e.csv

$ ls -al /tmp/*.csv
-rw-r--r--  1 phil  wheel  1057 Sep 17 15:48 /tmp/e.csv
-rw-r--r--  1 phil  wheel  1057 Sep 17 15:48 /tmp/f.csv

$ diff /tmp/e.csv /tmp/f.csv
$ echo $?
0

The difference is that for -f csv, the response comes back from zqd as ZNG and it's up to the client to format it (into CSV in this case). As sniffed on the wire, the response begins:

Content-Type: application/x-zng
X-Request-Id: 15
Date: Thu, 17 Sep 2020 22:50:03 GMT
Transfer-Encoding: chunked
...

For -e csv, the response comes back from the server already rendered as CSV.

HTTP/1.1 200 OK
Content-Type: text/csv
X-Request-Id: 17
Date: Thu, 17 Sep 2020 22:51:01 GMT
Transfer-Encoding: chunked

421
_path,ts,peer,mem,pkts_proc,bytes_recv,pkts_dropped,pkts_link,pkt_lag,events_proc,events_queued,active_tcp_conns,active_udp_conns,active_icmp_conns,tcp_conns,udp_conns,icmp_conns,timers,active_timers,files,active_files,dns_requests,active_dns_requests,reassem_tcp_size,reassem_file_size,reassem_frag_size,reassem_unknown_size
stats,2018-03-24T17:35:20.601137Z,zeek,282,5467567,3398705931,-,-,-,1535999,1535998,4239,146,305,193639,4731,2510,879701,25895,35230,88,6,0,455128,0,0,0
...
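A side note on the capture above: the bare 421 before the CSV header is the chunk-size line from the chunked transfer encoding, which HTTP/1.1 writes in hexadecimal. A quick check in Python suggests it lines up with the 1057-byte files listed earlier, assuming the whole body arrived in a single chunk (the capture is truncated, so that part is an inference):

```python
# Chunk-size line as it appears in the sniffed response.
chunk_size_line = "421"

# HTTP/1.1 chunked transfer encoding expresses each chunk size in hex.
chunk_size = int(chunk_size_line, 16)

print(chunk_size)  # 1057, the same size ls reported for /tmp/e.csv
```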

As explained by @mccanne, this (along with the similar NDJSON response recently added) could allow users to make REST calls to zqd with everyday tooling (curl, Python, etc.) and start processing the response immediately. The alternative is getting back ZNG or zjson and either not knowing what to do with it or having to convert it with a tool like zapi.
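As a rough illustration of the "everyday tooling" point, here is a minimal Python sketch that reads a text/csv body with only the standard library. The HTTP request itself is left out because the endpoint path and request payload are not shown in this thread. The `body` string stands in for the response text and is trimmed to the first six columns of the output above.

```python
import csv
import io

# Stand-in for the text/csv response body: the first six columns of the
# header and row from the sniffed response, trimmed for brevity.
body = (
    "_path,ts,peer,mem,pkts_proc,bytes_recv\n"
    "stats,2018-03-24T17:35:20.601137Z,zeek,282,5467567,3398705931\n"
)

# DictReader uses the first line as the column names.
rows = list(csv.DictReader(io.StringIO(body)))

for row in rows:
    # Every CSV field arrives as a string; convert the ones you need.
    print(row["_path"], row["peer"], int(row["bytes_recv"]))
```

No ZNG or zjson decoder is needed on the client side, which is the convenience described above.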

Thanks @mccanne!