cloudflare / goflow

The high-scalability sFlow/NetFlow/IPFIX collector used internally at Cloudflare.
BSD 3-Clause "New" or "Revised" License

Losing flows from many network devices with 1 collector #27

Closed MIKNOTAURO closed 5 years ago

MIKNOTAURO commented 5 years ago

Hi guys! I want to send flows from many routers to a goflow collector and then push those flows to a Kafka topic. My question is as follows: do you have any recommendation for calculating the ratio of network devices (routers) to collectors? Right now I have 3 routers pointed at 1 goflow collector and I have the impression that I'm losing packets. Do you have any advice or hints? By the way, I'm running goflow through Docker. Do I need to run multiple containers? Thanks for your time, guys ;)

lspgn commented 5 years ago

Hi @MIKNOTAURO, Which version are you using? How many flows per router? NetFlow or sFlow? It really depends on the specs of the server. Are the containers limited in RAM and CPU?

Decoding takes around 25 µs per NetFlow/IPFIX packet and 75 µs per sFlow packet. At this rate, a GoFlow instance running on one core should be able to decode 10k-40k samples per second. But this does not take into account the time to push the data to Kafka, which can depend a lot on your setup.

Try running it without Kafka and check the metrics.

Regarding the drops, the UDP receive buffer may be too small. I'd suggest monitoring the following:

# netstat -su
Udp:
    xxxx packets received
    xxxx packets to unknown port received
    xxxx packet receive errors
    xxxx receive buffer errors

should help.

Thank you