phaag / nfdump

Netflow processing tools

Filter not working with sflow, but works fine with netflow #551

Closed: dojosout closed this issue 1 month ago

dojosout commented 1 month ago

When running nfdump against captured netflow data, the filter works fine ('bytes > 10M' in this case). When running the same command against captured sflow data, everything is filtered out. I get a "No matching flows" error. For example, running against captured sflow:

nfdump -r . -a -n 10 -O bytes gives the following output:

Summary: total flows: 80023052, total bytes: 41.2 T, total packets: 40.0 G, avg bps: 4.9 G, avg pps: 595409, avg bpp: 1030
Time window: 2024-08-01 00:00:00 - 2024-08-01 18:39:59
Total flows processed: 80023052, passed: 80023052, Blocks skipped: 0, Bytes read: 13387580696
Sys: 2.5930s User: 23.9467s Wall: 17.9429s flows/second: 4459874.1 Runtime: 17.9977s

That's what I expect. Running nfdump -r . -a -n 10 -O bytes 'bytes > 10M', however, gives the following output:

No matching flows
Summary: total flows: 0, total bytes: 0, total packets: 0, avg bps: 0, avg pps: 0, avg bpp: 0
Time window: 2024-08-01 00:00:00 - 2024-08-01 18:44:59
Total flows processed: 80414245, passed: 0, Blocks skipped: 0, Bytes read: 13452841612
Sys: 2.7417s User: 15.0918s Wall: 10.2533s flows/second: 7842735.0 Runtime: 10.2535s

If I run the same two commands against netflow data, both produce the same output, which is expected since the largest flows are all > 10M.

My use case is that I use nfdump to filter out all flows smaller than 10MB since I'm not interested in small flows. This is working fine for netflow, but with sflow the filter is throwing out everything. No flows match. I've just recently updated my flow processing pipeline to use 1.7.4, so this is a new behavior I haven't seen before.

phaag commented 1 month ago

Hmm - the file format is the same. The filter cannot distinguish between netflow and sflow. Can you send me an sflow nfcapd.xx file by mail? Mail address is in the authors file.

phaag commented 1 month ago

The individual flows in the sflow file may each be smaller than 10M, yet add up to more than 10M once aggregated with -a. Because filtering takes place before aggregation, all of those flows are stripped out before they can be aggregated. You would need a post-processing filter to do this. That's on the todo list.
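To make the ordering concrete, here is a toy Python sketch. It is not nfdump code: the record layout, the helper names (`aggregate`, `filter_then_aggregate`, `aggregate_then_filter`), the sample numbers, and the decimal reading of "10M" are all made up for illustration. It only assumes what is described above, namely that several small records can share one flow key.

```python
from collections import defaultdict

# Hypothetical threshold standing in for the 'bytes > 10M' filter.
THRESHOLD = 10 * 1000 * 1000

# Hypothetical records: (flow key, bytes). Three small records share one key.
records = [
    ("10.0.0.1->10.0.0.2", 4_000_000),
    ("10.0.0.1->10.0.0.2", 4_000_000),
    ("10.0.0.1->10.0.0.2", 4_000_000),
    ("10.0.0.3->10.0.0.4", 1_000_000),
]


def aggregate(recs):
    """Sum the byte counts of all records that share a flow key."""
    totals = defaultdict(int)
    for key, nbytes in recs:
        totals[key] += nbytes
    return dict(totals)


def filter_then_aggregate(recs, threshold):
    """Filter individual records first, then aggregate what is left."""
    return aggregate((k, b) for k, b in recs if b > threshold)


def aggregate_then_filter(recs, threshold):
    """Aggregate first, then filter the aggregated totals (a post filter)."""
    return {k: b for k, b in aggregate(recs).items() if b > threshold}


pre = filter_then_aggregate(records, THRESHOLD)
post = aggregate_then_filter(records, THRESHOLD)
print("filter before aggregation:", pre)
print("filter after aggregation: ", post)
```

With these sample numbers no single record exceeds the threshold, so filtering first leaves nothing to aggregate, while filtering the aggregated totals keeps the one key whose records sum to 12,000,000 bytes.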

As a workaround, you can work with a temporary flow file:

nfdump -r . -a -z=lz4 -w tempflows
nfdump -r tempflows -n 10 -O bytes 'bytes > 10M'

That should do the trick. You can even see how many flows have been aggregated.
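If the two-step workaround needs to run from a script, a minimal Python sketch could look like the following. The function names `workaround_commands` and `run_workaround` and their parameters are hypothetical; the nfdump arguments are copied from the two commands above and nothing else about the CLI is assumed.

```python
import subprocess


def workaround_commands(flow_dir, temp_file, flow_filter="bytes > 10M", top_n=10):
    """Build the two nfdump invocations from the workaround as argv lists."""
    # Step 1: aggregate all flows and write them to a temporary flow file.
    aggregate_cmd = ["nfdump", "-r", flow_dir, "-a", "-z=lz4", "-w", temp_file]
    # Step 2: apply the filter to the already aggregated flows.
    report_cmd = ["nfdump", "-r", temp_file, "-n", str(top_n), "-O", "bytes", flow_filter]
    return aggregate_cmd, report_cmd


def run_workaround(flow_dir, temp_file, flow_filter="bytes > 10M", top_n=10):
    """Run both steps in order; raises CalledProcessError if either one fails."""
    aggregate_cmd, report_cmd = workaround_commands(flow_dir, temp_file, flow_filter, top_n)
    subprocess.run(aggregate_cmd, check=True)
    subprocess.run(report_cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the filter string, which contains spaces and a `>` character.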

phaag commented 1 month ago

Added a proof of concept for a post filter to the master repo. Use the CLI switch -P, for example -P 'bytes > 10M', with the standard filter syntax.

phaag commented 1 month ago

Done.