phaag / go-nfdump

go-nfdump: A Go module to read and process nfdump files
BSD 2-Clause "Simplified" License

Compat16 #14

Open mindsur opened 3 months ago

mindsur commented 3 months ago

We are still heavily using nfdump files in the older format, so I have attempted to implement parsing of DataBlock type 2 and the CommonRecord types based on the original C code. I am quite new to Go and a complete newbie in C, so this may not be the most accurate implementation.

Next, I still want to be able to process multiple files, as the original C code does with "-M", and to aggregate flows in some way, as with "-a". I am also planning some optimizations to printing: the current Sprintf sample is quite slow at converting between data types.
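One common way to speed up that kind of formatting in Go is to replace fmt.Sprintf (which reflects on its arguments and allocates) with strconv's Append functions writing into a reusable buffer. A minimal sketch, with a hypothetical flow struct standing in for the real record types:

```go
package main

import (
	"fmt"
	"strconv"
)

// flow is a hypothetical record for illustration only; the real
// go-nfdump record types carry many more fields.
type flow struct {
	packets, bytes uint64
}

// formatSlow uses fmt.Sprintf, which inspects its arguments via
// reflection and allocates a new string per call.
func formatSlow(f flow) string {
	return fmt.Sprintf("%d,%d", f.packets, f.bytes)
}

// formatFast appends the digits directly with strconv, avoiding
// reflection and letting the caller reuse the buffer.
func formatFast(f flow, buf []byte) []byte {
	buf = strconv.AppendUint(buf, f.packets, 10)
	buf = append(buf, ',')
	buf = strconv.AppendUint(buf, f.bytes, 10)
	return buf
}

func main() {
	f := flow{packets: 10, bytes: 1500}
	buf := make([]byte, 0, 64) // reused across records in a real loop
	fmt.Println(formatSlow(f))
	fmt.Println(string(formatFast(f, buf)))
}
```

In a tight print loop, resetting the buffer with `buf = buf[:0]` between records keeps allocations near zero.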

mindsur commented 3 months ago

I have also implemented only a small subset of the extensions, primarily the ones that we use.

phaag commented 3 months ago

Hi, many thanks for your code and your contribution. Anything that improves code and functionality is welcome; in the end, however, the code must fit the idea of the project. I am a bit reluctant to introduce old-format compatibility into new projects. Converting old 1.6.x files into 1.7.x files is already implemented in nfdump, simply by reading and writing those files. Therefore, I would be interested to know whether it would be an option for you to convert those files into the newer format with nfdump and then read and process them with go-nfdump. This has the advantage that the files are fully converted, not just the fields you need.
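The read/write conversion described above would look roughly like this on the command line (file names are illustrative, and a 1.7.x nfdump binary is assumed):

```shell
# Read a 1.6.x capture and write it back out; nfdump 1.7.x emits
# the new file format on write, converting all fields.
nfdump -r nfcapd.old -w nfcapd.converted
```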

mindsur commented 3 months ago

Hi Peter, thanks for your feedback and the additional insight into the project. I completely understand the desire to keep this project free of legacy functions.

Rewriting the files into the new format is not something we would like to do, since we have a lot of data and are building an integration that requires fast processing and ingestion of it. But we are investigating a way to change the source of the data to use the newer format, and hopefully we will be able to do that.

Once we have a way to read the data, our end goal is an integration of go-nfdump that can process the flows as they come in, generate different views/slices of them for different use cases, and push them to the appropriate database tables.

This will likely require features to read multiple files and to aggregate flows, too, so hopefully I can contribute some of that in the future. I hope it's fine to send you occasional questions about your vision for the project and the functionality we are trying to implement, to see whether it is aligned or could be done in a better way.