Hi @mieczkowski, thank you for the performance comparison. Could you also provide information about the setup (CPU/cores/arch, live traffic/simulator, pps)? After the changes, is there an explanation for SFLOW_5 being +28%?
I may refactor this PR a bit. I think adding configuration options like workers is good, but I don't want the program to make too many assumptions.
Could you also provide information about the setup (CPU/cores/arch, live traffic/simulator, pps)?
Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz, Debian Linux, 8 cores. Real flows from some production environments, 20-25k flows per second.
┌────────────datetime─┬─count()─┐
│ 2023-08-26 17:00:09 │   21774 │
│ 2023-08-26 17:00:08 │   24071 │
│ 2023-08-26 17:00:10 │   20860 │
│ 2023-08-26 17:00:04 │   21283 │
│ 2023-08-26 17:00:03 │   19179 │
│ 2023-08-26 17:00:02 │   20249 │
│ 2023-08-26 17:00:05 │   20164 │
│ 2023-08-26 17:00:01 │   20832 │
│ 2023-08-26 17:00:06 │   20018 │
│ 2023-08-26 17:00:07 │   19880 │
│ 2023-08-26 17:00:00 │   22058 │
└─────────────────────┴─────────┘
After the changes, is there an explanation for SFLOW_5 being +28%?
pmacct aggregates flows: for example, if there are 5 flows with identical parameters within one second, pmacct will group them, put 5 in the flows key, and sum bytes+packets. GoFlow2 sends raw flows, so it ends up with more records in the DB.
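To illustrate with hypothetical numbers: 5 identical flows of 100 bytes each arriving in the same second become a single pmacct record (flows=5, bytes=500), while GoFlow2 writes 5 separate records of 100 bytes each, so the raw row count can easily be higher.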
I may refactor this PR a bit
I have nothing against that :) Maybe a good idea would be some "dynamic workers" mechanism: if the dispatch channel is growing, run more workers, and clean them up when traffic falls.
Maybe a good idea would be some "dynamic workers" mechanism: if the dispatch channel is growing, run more workers, and clean them up when traffic falls.
It's a good idea but I'm not sure the added complexity is worth it: flows will have spikes periodically (unless there's a DDoS) and I don't think idle workers consume that much, since this is only a single "task" (decoding flows) and it will have to compete against other applications without being able to kick them out. I would rather suggest fixed capacity allocation. Overall I think this is more for platforms like Kubernetes to guarantee a minimum resource, autoscale, or reschedule elsewhere.
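For reference, a minimal sketch of what such a backlog-driven worker pool could look like; the names (dispatch, decode), the thresholds, and the idle timeout are illustrative assumptions, not goflow2's actual API:

```go
package workers

import (
	"sync/atomic"
	"time"
)

// dynamicPool is an illustrative sketch, not goflow2 code: it grows the
// number of decoding workers when the dispatch channel backlog is high and
// lets idle workers exit after a timeout.
type dynamicPool struct {
	dispatch chan []byte  // channel fed by the socket readers (assumed name)
	decode   func([]byte) // decoding callback (assumed name)
	workers  int64        // current number of workers
	max      int64        // upper bound on workers
}

func (p *dynamicPool) spawn() {
	atomic.AddInt64(&p.workers, 1)
	go func() {
		defer atomic.AddInt64(&p.workers, -1)
		for {
			select {
			case pkt, ok := <-p.dispatch:
				if !ok {
					return // channel closed, shut down
				}
				p.decode(pkt)
			case <-time.After(30 * time.Second):
				// No traffic for a while: let this worker exit.
				// (A reusable timer would be more efficient under load.)
				return
			}
		}
	}()
}

// scale periodically checks the backlog and spawns extra workers while the
// dispatch channel stays more than half full.
func (p *dynamicPool) scale() {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for range ticker.C {
		if len(p.dispatch) > cap(p.dispatch)/2 &&
			atomic.LoadInt64(&p.workers) < p.max {
			p.spawn()
		}
	}
}
```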
I would also suggest moving initialization of metrics to a separate package. When I wanted to add a metric for dropped packets (in udp.go) there was an import cycle (metrics imports utils).
moving initialization of metrics to a separate package
There is https://github.com/netsampler/goflow2/blob/main/metrics/metrics.go, but I designed it in such a way that it should be callbacks/wrappers called from the main app.
But there is an import of the utils package: https://github.com/netsampler/goflow2/blob/main/metrics/decoder.go#L9 So you can't use metrics in the utils pkg (which is where the counter for dropped packets should be).
So you can't use metrics in the utils pkg (which is where the counter for dropped packets should be)
Correct, I wanted to separate metrics from processing/decoding: for instance, if someone imports the package, it won't impose Prometheus metrics.
The GoFlow2 app has a decodeFunc callback which is wrapped by metrics.
But I agree there is a need to count the number of dropped packets.
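One way to keep that separation while still counting drops could be a callback hook that the main app wires up; the names below (DropHandler, onPacketDropped, droppedPackets) are hypothetical, not the existing goflow2 API:

```go
// Hypothetical sketch of a drop-counter hook that avoids the import cycle:
// utils exposes a callback variable, and the main application (which already
// imports both utils and the metrics package) wires it to a Prometheus counter.
package utils

// DropHandler is set by the main app, for example:
//   utils.DropHandler = func(n int) { droppedPackets.Add(float64(n)) }
// utils itself never imports prometheus or the metrics package.
var DropHandler func(n int)

// onPacketDropped would be called wherever a packet has to be discarded
// (for example when the dispatch channel is full).
func onPacketDropped(n int) {
	if DropHandler != nil {
		DropHandler(n)
	}
}
```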
I was testing flow collection from pmacct and goflow2 and realized that goflow2 is losing packets (goflow2 should have MORE data, because pmacct aggregates in memory and flushes the aggregated records).
After debugging I realized that with 1 socket = 1 worker and a high volume of flows per second, reading packets is faster than processing them, so the dispatch channel keeps growing. I therefore propose, in udp.go: increase the default worker count to cfg.Sockets * 2.
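A minimal sketch of what that default could look like (the function and parameter names are assumptions based on this discussion, not the exact udp.go code):

```go
package udpworkers

// startWorkers is an illustrative sketch of the proposed change: default the
// decoding worker count to twice the number of UDP sockets so that packet
// reads do not outpace decoding and the dispatch channel stops growing.
func startWorkers(workers, sockets int, dispatch <-chan []byte, decode func([]byte)) {
	if workers == 0 {
		workers = sockets * 2 // proposed default: cfg.Sockets * 2
	}
	for i := 0; i < workers; i++ {
		go func() {
			for pkt := range dispatch {
				decode(pkt)
			}
		}()
	}
}
```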
After the changes: