mfontanini / libtins

High-level, multiplatform C++ network packet sniffing and crafting library.
http://libtins.github.io/
BSD 2-Clause "Simplified" License

What can be done for packet loss? #445

Closed: jeffRTC closed this issue 3 years ago

jeffRTC commented 3 years ago

@mfontanini

Thank you for building this abstraction; it really helped me put together a prototype. However, I'm now running into packet loss when capturing. I often notice that packets such as TCP SYNs are missing, while TCP PSH+ACK packets show up fine when queried.

I'm using the following configuration when setting up the capture loop:

    // Capture only incoming packets
    config.set_direction(PCAP_D_IN);

    // Capture only TCP packets going to port 80
    config.set_filter("tcp port 80");

    // Deliver packets immediately instead of buffering them
    config.set_immediate_mode(true);

The only thing I do in the callback is push the parsed packet into a MongoDB database, and I don't think that's causing the packet loss because it's pretty fast.
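For reference, here is a minimal sketch of how this configuration is wired into the capture loop. The interface name and the callback body are placeholders; the real callback does the MongoDB insert instead of printing:

    #include <tins/tins.h>
    #include <iostream>

    // Placeholder callback: the real one pushes the parsed packet into MongoDB.
    bool on_packet(Tins::PDU& pdu) {
        const Tins::IP&  ip  = pdu.rfind_pdu<Tins::IP>();
        const Tins::TCP& tcp = pdu.rfind_pdu<Tins::TCP>();
        std::cout << ip.src_addr() << ":" << tcp.sport() << " -> "
                  << ip.dst_addr() << ":" << tcp.dport() << std::endl;
        return true; // keep capturing
    }

    int main() {
        Tins::SnifferConfiguration config;

        // Capture only incoming packets
        config.set_direction(PCAP_D_IN);

        // Capture only TCP packets going to port 80
        config.set_filter("tcp port 80");

        // Deliver packets immediately instead of buffering them
        config.set_immediate_mode(true);

        Tins::Sniffer sniffer("eth0", config); // "eth0" is an assumed interface name
        sniffer.sniff_loop(on_packet);
    }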

mfontanini commented 3 years ago

Writing to MongoDB for every packet you see is going to create a giant bottleneck. Your database may be fast, but you're talking about thousands of packets per second; it will definitely not be able to handle that scale, especially if you're doing a single write per packet.

You'll need to create a queue, push your packets into it, and process them asynchronously. Your packet sniffing thread should be as fast as possible, doing as few blocking operations as you can. If you write to Mongo in batches (e.g. use bulk operations), it may be able to handle the load, depending on how complex your documents are. Ideally you would aggregate only the few bits of data you want rather than write a single document per packet, as that's a ton of data.

e.g. something like this:

[attached diagram: "ProcessCicle"]
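A rough sketch of that producer/consumer split, assuming a plain std::queue plus mutex; write_batch_to_mongo is a hypothetical stand-in for your MongoDB bulk insert, and the interface name is an assumption:

    #include <tins/tins.h>
    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    // Queue shared between the sniffing (producer) thread and the DB (consumer) thread.
    std::queue<Tins::Packet> packet_queue;
    std::mutex queue_mutex;
    std::condition_variable queue_cv;

    // Hypothetical stand-in for a MongoDB bulk insert of a whole batch.
    void write_batch_to_mongo(const std::vector<Tins::Packet>& batch) {
        // Build documents from the packets and issue a single bulk write here.
    }

    // Producer: keep the sniffing callback as cheap as possible, just enqueue.
    bool on_packet(Tins::Packet& packet) {
        {
            std::lock_guard<std::mutex> lock(queue_mutex);
            packet_queue.push(packet); // Packet copies its PDU
        }
        queue_cv.notify_one();
        return true; // keep sniffing
    }

    // Consumer: drain whatever is queued and write it out in one batch.
    void consumer_loop() {
        while (true) {
            std::unique_lock<std::mutex> lock(queue_mutex);
            queue_cv.wait(lock, [] { return !packet_queue.empty(); });

            std::vector<Tins::Packet> batch;
            while (!packet_queue.empty()) {
                batch.push_back(packet_queue.front());
                packet_queue.pop();
            }
            lock.unlock();

            write_batch_to_mongo(batch); // slow work happens off the sniffing thread
        }
    }

    int main() {
        std::thread consumer(consumer_loop);

        Tins::SnifferConfiguration config;
        config.set_direction(PCAP_D_IN);
        config.set_filter("tcp port 80");
        config.set_immediate_mode(true);

        Tins::Sniffer sniffer("eth0", config); // interface name is an assumption
        sniffer.sniff_loop(on_packet);

        consumer.join();
    }

The key point is that the sniffing thread only locks the mutex long enough to enqueue, so all database latency is absorbed by the consumer thread.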

jeffRTC commented 3 years ago

@mfontanini Thank you!

jeffRTC commented 2 years ago

@mfontanini I have one issue with periodically writing in batches: I want to query the packet data per IP in real time, without extra delay. So, is there any problem with the consumer thread writing directly to MongoDB without batching?