dnstap / dnstap-ldns

reference dnstap decoding utility
Apache License 2.0
20 stars 8 forks source link

add JSON format outputFeature/json output #7

Closed hirachan closed 3 years ago

hirachan commented 3 years ago

-q has very few information, -y yaml has enough information but it is not easy to use by program. JSON format can be used easily for other next hop filtering applications such as fluentd.

edmonds commented 3 years ago

The -j flag isn't included in the usage() help output.

The querytime field in the JSON output is generated by calling ldns_pkt_querytime() but if I understand correctly this is retrieving an ldns_pkt_t field that will only ever be set to a non-zero value if you are using the LDNS library to send DNS queries, which dnstap-ldns does not do. So it will always be set to 0. I suggest just removing it.

The edns_extended_rcode field in the JSON output is odd and I think just an artifact of how LDNS splits the DNS response code value and makes the two parts available via its API. I don't think this field is necessary and the DNS response code (regardless of whether it uses the EDNS response code extension space) should be exposed as a single field in a proposed JSON format for dnstap.

Similarly I'm not quite sure what edns_udp_size is doing but I haven't tracked it down. I see it set to 0 about 99% of the time in a file with ~100K dnstap payloads from an unbound resolver. Not sure what this is supposed to represent.

dnstap log payloads are nested. There is an outer structure with metadata and an inner payload of a particular type. See https://github.com/dnstap/dnstap.pb/blob/master/dnstap.proto. The YAML output format replicates this structure but this JSON output format does not and puts fields into a single JSON object. I think a JSON representation of dnstap payloads should also replicate the nested structure.

The extra timestamp field (@timestamp) looks like a convenience field for a particular application that uses JSON that needs a timestamp in a different format.

The dnstap-ldns tool was originally intended to be a reference utility. If a JSON representation of dnstap payloads is needed it probably makes sense to spend a bit more time designing a format and getting consensus via the mailing list before merging something.

vixie commented 3 years ago

If a JSON representation of dnstap payloads is needed it probably makes sense to spend a bit more time designing a format and getting consensus via the mailing list before merging something.

+1.

cmikk commented 3 years ago

Noted, and agreed.

Given that this is merged already, my inclination is to revert this merge (returning dnstap-ldns to reference status), open a discussion on the dnstap mailing list regarding a standard JSON (and YAML?) encoding for dnstap data, and adapt both dnstap-ldns and golang-dnstap to any agreed-on format.

Barring any objections, I will do this revert and start the discussion next week.

@hirachan: you will need to continue using your JSON fork in the interim, and the JSON support will disappear from the main dnstap-ldns, and possibly be replaced with a different JSON format.

hirachan commented 3 years ago

Sorry for late reply.

Ok, I agreed. I also think designing JSON format is not easy. Please let me know if you guys re-implement JSON format, I will join discussion, and help implementations.