taylor-swanson opened 1 year ago
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
Pinging @elastic/sec-linux-platform (Team:Security-Linux Platform)
The changes from https://github.com/elastic/beats/pull/35453 would help with this, but the implementation is too complicated.
It would make more sense for Packetbeat to have a config option like `packetbeat.workers: 3`
that creates several workers in the same fanout
group, instead of requiring the user to maintain duplicated files.
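To make the suggestion concrete, the proposed option could look like the sketch below. This is hypothetical: `packetbeat.workers` does not exist today, and the field name and placement are assumptions.

```yaml
# Hypothetical config: spawn 3 capture workers sharing one AF_PACKET
# fanout group, instead of running 3 separate Packetbeat instances.
packetbeat.workers: 3

packetbeat.interfaces.device: eth0
packetbeat.interfaces.type: af_packet
```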
I'm currently deploying Packetbeat using Ansible,
and running multiple Packetbeat instances requires creating custom systemd unit files, which is very error-prone.
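For context, the workaround today looks roughly like the template unit below, with one duplicated config file per instance. The paths and interface are illustrative, not a recommended setup:

```ini
# /etc/systemd/system/packetbeat@.service (hypothetical template unit)
# Started as: systemctl start packetbeat@0 packetbeat@1 packetbeat@2
[Unit]
Description=Packetbeat instance %i
After=network.target

[Service]
# Each instance needs its own near-identical config file.
ExecStart=/usr/share/packetbeat/bin/packetbeat -c /etc/packetbeat/packetbeat-%i.yml
Restart=always

[Install]
WantedBy=multi-user.target
```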
I have a 10 GbE interface, and packet data takes 10-30 minutes to even show up in Elasticsearch because a single Packetbeat instance is so slow.
Thoughts, @andrewkroh @taylor-swanson?
Packetbeat currently performs all reading and processing of packet data on a single goroutine. This prevents Packetbeat from scaling on machines with more than one CPU core, at least within one process. There is at least one other goroutine, which reads off an event queue and submits events to an output.
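To illustrate what moving off a single goroutine could look like, here is a minimal sketch of hash-based dispatch to a pool of worker goroutines, so that all packets of one flow land on the same worker and per-flow ordering is preserved. All names here are hypothetical; this is not Packetbeat's actual code.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// packet is a stand-in for decoded packet metadata (hypothetical type).
type packet struct {
	flowKey string // e.g. "src:sport>dst:dport/proto"
}

// workerFor picks which of n workers handles a flow, using a stable
// hash so every packet of a given flow goes to the same goroutine.
func workerFor(flowKey string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(flowKey))
	return int(h.Sum32() % uint32(n))
}

func main() {
	const numWorkers = 3
	chans := make([]chan packet, numWorkers)
	var wg sync.WaitGroup
	for i := range chans {
		chans[i] = make(chan packet, 16)
		wg.Add(1)
		go func(in <-chan packet) {
			defer wg.Done()
			for p := range in {
				// Per-worker protocol analysis would happen here.
				_ = p
			}
		}(chans[i])
	}

	flows := []string{
		"10.0.0.1:1234>10.0.0.2:80/tcp",
		"10.0.0.3:5555>10.0.0.2:443/tcp",
	}
	for i := 0; i < 10; i++ {
		f := flows[i%len(flows)]
		chans[workerFor(f, numWorkers)] <- packet{flowKey: f}
	}
	for _, c := range chans {
		close(c)
	}
	wg.Wait()
	fmt.Println("same flow, same worker:",
		workerFor(flows[0], numWorkers) == workerFor(flows[0], numWorkers))
}
```

The hash step matters because TCP reassembly and protocol state are per-flow; a round-robin dispatch would scatter one connection's packets across workers.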
Deciding how many goroutines to spin up, and which ones do the work for which protocols, will be a challenge. AF_PACKET has load-balancing features, but Packetbeat does not yet support them. One such feature is PACKET_FANOUT_HASH, which uses a flow hash to route traffic between different sockets; that could be explored as part of this change. Another option would be to broker incoming packet data to a generic set of worker goroutines (all protos/ports get an equal share), or perhaps give each protocol or port a specific allotment.
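For reference, the PACKET_FANOUT socket option takes a 32-bit argument packing the fanout group id in the low 16 bits and the mode in the high 16 bits; every socket that sets the same group id shares the load. The sketch below computes that argument (constants copied from `<linux/if_packet.h>` so it builds anywhere); the actual `setsockopt` call is shown only in a comment since it needs a Linux AF_PACKET socket and CAP_NET_RAW.

```go
package main

import "fmt"

// Values from <linux/if_packet.h>, defined locally for portability.
const (
	PACKET_FANOUT      = 18 // setsockopt option name (SOL_PACKET level)
	PACKET_FANOUT_HASH = 0  // distribute packets by flow hash
)

// fanoutArg packs a fanout group id and mode into the integer passed
// to setsockopt: low 16 bits = group id, high 16 bits = mode.
func fanoutArg(groupID, mode uint16) int {
	return int(uint32(groupID) | uint32(mode)<<16)
}

func main() {
	// On Linux, each capture socket would then join the group with
	// something like (using golang.org/x/sys/unix):
	//   unix.SetsockoptInt(fd, unix.SOL_PACKET, unix.PACKET_FANOUT,
	//       fanoutArg(42, PACKET_FANOUT_HASH))
	fmt.Println(fanoutArg(42, PACKET_FANOUT_HASH)) // → 42
}
```

With PACKET_FANOUT_HASH the kernel keeps each flow pinned to one socket, which lines up nicely with Packetbeat's per-flow protocol state.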