y-scope / clp

Compressed Log Processor (CLP) is a free log management tool capable of compressing logs and searching the compressed logs without decompression.
https://yscope.com
Apache License 2.0

implementation for dns log #606


wdweng commented 5 days ago

Request

My graduation project is about DNS log compression and search. I have read your paper and found that the JSON version is very well suited to DNS logs, so I want to develop an implementation for DNS logs.

Possible implementation

A DNS log is a semi-structured text file with the format time--CIP--RIP--QType--QName--Resource Records, which is very similar to JSON. The resource records vary in length, and most values in each field are not repetitive. I want to change the code in clp-s to fit DNS log input.

gibber9809 commented 5 days ago

Hi @wdweng,

The simplest thing you could do is convert your data to newline-delimited JSON and then ingest that. That way everything should work for you out of the box without having to change any code.
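
For reference, a converter along these lines might look roughly like the sketch below. The field names ("time", "cip", etc.) and the "--" splitting are assumptions based on the format described above, and nlohmann/json is used purely for convenience; adjust to match your actual schema.

```cpp
// dns_to_ndjson.cpp -- minimal sketch: convert "--"-delimited DNS log lines
// into newline-delimited JSON that clp-s can ingest directly.
// Field names are placeholders; adapt them to your data.
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

#include <nlohmann/json.hpp>  // assumes nlohmann/json is available

int main() {
    std::string line;
    while (std::getline(std::cin, line)) {
        // Split the line on the "--" delimiter.
        std::vector<std::string> fields;
        std::size_t pos = 0;
        std::size_t next;
        while ((next = line.find("--", pos)) != std::string::npos) {
            fields.push_back(line.substr(pos, next - pos));
            pos = next + 2;
        }
        fields.push_back(line.substr(pos));

        if (fields.size() < 5) {
            continue;  // skip malformed lines
        }

        nlohmann::json record;
        record["time"] = fields[0];
        record["cip"] = fields[1];
        record["rip"] = fields[2];
        record["qtype"] = fields[3];
        record["qname"] = fields[4];
        // Everything after QName is treated as a variable-length list of
        // resource records.
        record["resource_records"]
                = std::vector<std::string>(fields.begin() + 5, fields.end());

        std::cout << record.dump() << '\n';
    }
    return 0;
}
```

The resulting newline-delimited JSON can then be compressed and searched with clp-s without any code changes.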

If you do want to directly ingest DNS logs there is a way to do that (discussed in my master's thesis), but it isn't very user-friendly at the moment. You will have to write a parser and a serializer for your DNS logs following a certain programming model. Additionally, you will have to change some parts of the code that currently assume every record is a JSON object, in particular here at ingestion, here during serialization, and here and here during search. Note that this is purely an issue with how the code is written right now -- the archive format itself can handle cases where records are not JSON.

When it comes to actually writing your parser and serializer, you will first have to add a type to this enum -- this is the type that gets encoded into the Merged Parse Tree and indicates what kind of structure is being represented.
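
As a rough sketch (the enum's actual name, location, and existing values in clp-s may differ from what is shown here), the addition would look something like:

```cpp
// Hypothetical sketch only -- the real enum in clp-s has a different set of
// values and may live elsewhere; the point is simply that a new value is
// needed to tag the custom DNS structure in the Merged Parse Tree.
#include <cstdint>

enum class NodeType : std::uint8_t {
    Object,
    StructuredArray,
    // ... other existing node types elided ...
    DnsRecord,  // new: a DNS log record stored as an unordered object
};
```

The places mentioned above that assume every record is a JSON object would then also need to accept this new type wherever they currently check for the object type.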

For writing the parser, hopefully this can act as a reference -- in particular, note the start_unordered_object and end_unordered_object calls that mark the start and end of parsing for this custom type. Between those calls you are free to call the "unordered" versions of the functions to manipulate the schema and values in a record -- performing parsing in this way guarantees that you see the same values in the same order at decompression time. You should hopefully be able to follow what we do in the parse() function in that same file and call your parser directly.
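
A skeleton of that flow might look like the following. The Parser interface and the split_on helper shown here are hypothetical and only mirror the call pattern described above; the real clp-s signatures will differ, so treat this purely as an outline of the control flow.

```cpp
// Hypothetical outline of a DNS-log parser following the pattern described
// above. None of these declarations are the real clp-s API; they only sketch
// the start_unordered_object / unordered-add / end_unordered_object flow.
#include <cstddef>
#include <string>
#include <vector>

enum class NodeType { DnsRecord /* ... */ };

// Stand-in for the parser's real interface.
class Parser {
public:
    void start_unordered_object(NodeType type);
    void add_unordered_value(std::string const& key, std::string const& value);
    void end_unordered_object();
};

// Splits `line` on the given delimiter; trivial helper, definition omitted.
std::vector<std::string> split_on(std::string const& line, std::string const& delim);

void parse_dns_line(Parser& parser, std::string const& line) {
    auto fields = split_on(line, "--");

    // Everything between these two calls is recorded as one unordered object,
    // so the same values come back in the same order at decompression time.
    parser.start_unordered_object(NodeType::DnsRecord);

    parser.add_unordered_value("time", fields.at(0));
    parser.add_unordered_value("cip", fields.at(1));
    parser.add_unordered_value("rip", fields.at(2));
    parser.add_unordered_value("qtype", fields.at(3));
    parser.add_unordered_value("qname", fields.at(4));

    // Resource records are variable length, so append whatever remains.
    for (std::size_t i = 5; i < fields.size(); ++i) {
        parser.add_unordered_value("resource_record", fields.at(i));
    }

    parser.end_unordered_object();
}
```

The idea is that, instead of the JSON parsing path, something like this would be called once per DNS log line from (or in place of) parse().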

For serialization it might be a bit more difficult to replicate what we do, since the code is heavily optimized for serializing JSON. Here in the code, the variable m_global_id_to_unordered_object should have everything you need to initialize the serializer for your special type. You can see how we use this information to prepare to serialize structurized arrays here. After preparing to serialize objects from a given table, the actual serialization code is here. I expect the details of how you'll initialize your serializer and actually serialize your data will be fairly different from what we do here.
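
The record-level piece you would end up writing is conceptually simple (rejoining fields with the original delimiter); the part you will have to adapt from the linked code is how the serializer is initialized from m_global_id_to_unordered_object and how the field values for each record are read back from the archive. A purely illustrative sketch of that record-level step:

```cpp
// Purely illustrative: given the field values recovered for one DNS record,
// rebuild the original "--"-delimited log line. Fetching `fields` from the
// archive (via the unordered-object metadata mentioned above) is the part
// that has to follow clp-s's serializer initialization code.
#include <cstddef>
#include <string>
#include <vector>

std::string serialize_dns_record(std::vector<std::string> const& fields) {
    std::string line;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (0 != i) {
            line += "--";
        }
        line += fields[i];
    }
    return line;
}
```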

Going forward, this should all become much simpler, but unfortunately support for custom parsing and serialization is not very mature right now.