[Improvement suggestion] High performance packet capturing in TRex

egwakim commented 2 years ago

The current performance of packet capturing is limited because captured packets are delivered through remote JSON RPC processing. High-performance packet capturing is very useful for troubleshooting and tuning. There could be alternative ways to capture packets faster:

Alternative #1: Add an option for the TRex server's master_thread to save a .pcap file directly instead of sending the packets to the client over JSON RPC.
Others: There could be other ways.
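To make Alternative #1 concrete, here is a minimal sketch of what "saving a .pcap file directly" involves on the writing side. It uses the classic pcap layout (a 24-byte global header followed by a 16-byte header per packet). The function name `write_pcap` and its arguments are illustrative assumptions, not part of TRex.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4       # classic pcap, microsecond timestamps
LINKTYPE_ETHERNET = 1


def write_pcap(path, packets, snaplen=65535):
    """Write (timestamp_seconds, raw_bytes) pairs to a classic pcap file."""
    with open(path, "wb") as f:
        # Global header: magic, version 2.4, thiszone, sigfigs, snaplen, linktype
        f.write(struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0,
                            snaplen, LINKTYPE_ETHERNET))
        for ts, data in packets:
            sec = int(ts)
            usec = int((ts - sec) * 1_000_000)
            captured = data[:snaplen]
            # Record header: ts_sec, ts_usec, captured length, original length
            f.write(struct.pack("<IIII", sec, usec, len(captured), len(data)))
            f.write(captured)
```

Calling `write_pcap("example.pcap", [(1.5, frame_bytes)])` would produce a file that standard pcap readers can open, assuming `frame_bytes` holds a raw Ethernet frame.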

Hi @hhaim, what is your opinion on this?

hhaim commented 2 years ago

@egwakim there is a feature to capture directly from the server using a ZMQ channel; have a look at the manual and the example in the regression. There was an issue with the watchdog (closing the ZMQ channel could leave the Rx thread stuck), but I think it was fixed.

hhaim commented 2 years ago

@egwakim see this https://trex-tgn.cisco.com/trex/doc/trex_stateless.html#_using_capture_port_for_faster_packet_capture_packet_injection

egwakim commented 2 years ago

Hi @hhaim, it's for STL with service enabled. Would it work in ASTF mode?

egwakim commented 2 years ago

Hi @hhaim Can we apply it in ASTF mode?

hhaim commented 2 years ago

@egwakim yes, the capture code is common to the interactive modes (STL and ASTF).

jsmoon commented 2 years ago

@hhaim "start_capture_port" won't capture the traffic packets of the DP cores. It only works with packets on the RX core, including packets redirected from the DP cores (e.g. latency packets). I don't think it was designed for general packet capturing.

hhaim commented 2 years ago

@jsmoon you are right, we already discussed it in the other thread. I think the best option is to let the Rx core pull from the software ring and send the packets through the ZMQ channel.

jsmoon commented 2 years ago

@hhaim I checked the performance of the ZMQ channel with a simple implementation. The implementation:

is based on the TrexCaptureMngr and TrexCapture features in CRxCore (invoked through the JSON RPC "capture").
adds an "endpoint" parameter to establish the ZMQ channel.
uses TrexPktBuffer as the software ring.
has TrexCapture pull packets from the TrexPktBuffer and write them to the ZMQ channel in CRxCore::work_tick().
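The structure described above (a bounded software ring drained on every Rx-core tick) can be sketched roughly as follows. This is a toy model in Python, not the TRex C++ code: `PacketRing` stands in for TrexPktBuffer, `work_tick` for CRxCore::work_tick(), and `sink` for the write to the ZMQ channel. All three names and the `burst` parameter are assumptions made for illustration.

```python
from collections import deque


class PacketRing:
    """Bounded FIFO standing in for the software ring; drops when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def push(self, pkt):
        if len(self.queue) >= self.capacity:
            self.dropped += 1          # producer outran the consumer
            return False
        self.queue.append(pkt)
        return True

    def pop_batch(self, max_pkts):
        batch = []
        while self.queue and len(batch) < max_pkts:
            batch.append(self.queue.popleft())
        return batch


def work_tick(ring, sink, burst=32):
    """One Rx-core iteration: drain up to `burst` packets into the sink."""
    batch = ring.pop_batch(burst)
    for pkt in batch:
        sink(pkt)                      # stands in for the ZMQ send
    return len(batch)
```

In a model like this, the sustained capture rate is bounded by how many packets each tick can hand to the sink, and anything beyond the ring capacity is counted as dropped.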

The result shows about 120 Kpps, but I need about 1 Mpps. To meet that requirement, I think I need to use shared memory to deliver the matched packets.
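To put the two figures side by side, the short calculation below converts each rate into a per-packet time budget. It assumes, as a simplification, that packets are handled one after another on a single core; the rates are the ones quoted in this comment.

```python
def per_packet_budget_us(pps):
    """Time available per packet, in microseconds, at a given packet rate."""
    return 1_000_000 / pps


measured_pps = 120_000      # rate reported for the ZMQ prototype
target_pps = 1_000_000      # rate requested in the thread

measured_budget = per_packet_budget_us(measured_pps)
target_budget = per_packet_budget_us(target_pps)
speedup_needed = target_pps / measured_pps

print(f"{measured_budget:.2f} us/pkt now, "
      f"{target_budget:.2f} us/pkt needed, "
      f"{speedup_needed:.1f}x faster")
# prints: 8.33 us/pkt now, 1.00 us/pkt needed, 8.3x faster
```

Under that assumption, the delivery path would need to spend roughly one microsecond per packet instead of about eight.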

The first option for using shared memory is to have the capture client offer the shared memory as the "endpoint". The second option is to add a new capturing method like dpdk-dumpcap.

@hhaim which one do you prefer? Or do you have any other ideas?

hhaim commented 2 years ago

@jsmoon I think it is better to use dpdk-dumpcap, but we should add an API to get the pcap file, set the path, etc. The API is the most important thing.

jsmoon commented 2 years ago

@hhaim do you mean using the dpdk-dumpcap binary itself as the capturing client? From my quick analysis of the dpdk-dumpcap source, TRex would need to call rte_pdump_init() to support it. Since librte_pdump uses rte_eth_rxtx_callback, I think we need to consider the order of the callback functions relative to the tunnel. For the API, I think the master daemon could be a good place to add it.

hhaim commented 2 years ago

@jsmoon rte_eth_rxtx_callback is not a good interface because of the tunnel support. Is it possible to call the capture explicitly?

It is better to have a TRex API and not a daemon API, as the daemon is used only by our regression and is not documented.

jsmoon commented 2 years ago

@hhaim I think librte_pdump is designed to use rte_eth_rxtx_callback. Since rte_eth_rxtx_callback is designed to allow multiple callback functions, I think there is no issue if we set the order properly.
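Why the order matters can be shown with a toy callback chain. The sketch below does not call DPDK; it only models a chain in which each callback receives the burst returned by the previous one. The 4-byte `OUTER_HDR` and the idea that tunnel handling runs as one of the callbacks are assumptions made purely for illustration.

```python
def run_rx_callbacks(callbacks, pkts):
    """Pass a burst through each callback in registration order."""
    for cb in callbacks:
        pkts = cb(pkts)
    return pkts


OUTER_HDR = b"TUNL"   # hypothetical 4-byte tunnel header, illustration only


def tunnel_decap(pkts):
    """Strip the outer header from packets that carry it."""
    return [p[len(OUTER_HDR):] if p.startswith(OUTER_HDR) else p for p in pkts]


def make_capture(store):
    """Build a capture callback that records what it sees into `store`."""
    def capture(pkts):
        store.extend(pkts)   # record the burst, then pass it on unchanged
        return pkts
    return capture


wire = [OUTER_HDR + b"inner-1", OUTER_HDR + b"inner-2"]

seen_before, seen_after = [], []
run_rx_callbacks([make_capture(seen_before), tunnel_decap], wire)
run_rx_callbacks([tunnel_decap, make_capture(seen_after)], wire)
```

With the capture callback registered first, `seen_before` holds the packets as they arrived with the outer header. With it registered after the decapsulation step, `seen_after` holds only the inner packets. Which of the two a user expects in the pcap file is the question the ordering has to answer.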

I have no idea how the TRex API could use dpdk-dumpcap and get the pcap file. What is your suggestion?

hhaim commented 2 years ago

@jsmoon I had in mind to keep the same set of APIs as today (capture/record/get) but add a way to change the mode to "file". In file mode the get_buffer API won't work, and the user will need to read the file by another method (such as NFS or rsync).

To sum up:

The API should be backward compatible, meaning the old (slow) way should still work.
The new way should support only the required functionality (setting the file name and the method).
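One way to read these two requirements is sketched below. It is a hypothetical Python model, not the TRex client API: the class name `CaptureSession`, the `mode` and `file_path` parameters, and the `record` method are assumptions. Only `get_buffer` and the "file" mode come from the comment above.

```python
class CaptureSession:
    """Hypothetical capture handle with a backward-compatible default mode."""

    def __init__(self, limit=1000, mode="buffer", file_path=None):
        if mode not in ("buffer", "file"):
            raise ValueError(f"unknown capture mode: {mode!r}")
        self.limit = limit
        self.mode = mode            # "buffer" keeps today's behaviour
        self.file_path = file_path  # only meaningful in "file" mode
        self._pkts = []

    def record(self, pkt):
        """Store a packet in buffer mode; file mode is written server-side."""
        if self.mode == "buffer" and len(self._pkts) < self.limit:
            self._pkts.append(pkt)

    def get_buffer(self):
        """Return captured packets; not available in file mode."""
        if self.mode == "file":
            raise RuntimeError(
                f"get_buffer is unavailable in file mode; read {self.file_path}")
        return list(self._pkts)
```

Existing callers that never pass `mode` would keep working unchanged, while `CaptureSession(mode="file", file_path="/tmp/cap.pcap")` opts into the new path and gets an explicit error from `get_buffer`.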

jsmoon commented 2 years ago

@hhaim, I have prepared the diagrams to explain what will be changed.

This diagram shows the current capturing framework.

I changed the diagram as follows. The parts in RED show the FILE saving concept.

"FILE endpoint" will be either a file path on the host running TRex or just the string "file". If no file path is given, the server will choose one internally and the client can get it from the "stop" response message.
The client can check that all captured packets have been written by looking at the "pending" field of the "fetch" response.
The client can then copy the file from the host running TRex by any general means (e.g., rsync, scp, FTP).
The file at the "endpoint FILE path" will be removed when the client requests "remove" for the capture.
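The client-side sequence described above could look roughly like the sketch below. Everything here is a stand-in: `FakeCaptureServer` simulates the server in memory, the `"endpoint"` key in the stop response is a hypothetical name, the file holds raw bytes rather than real pcap records, and `shutil.copyfile` takes the place of rsync or scp. Only the start/stop/fetch/remove steps and the "pending" field come from the proposal.

```python
import os
import shutil


class FakeCaptureServer:
    """In-memory stand-in for the server side; not the real TRex RPC."""

    def __init__(self, workdir):
        self.workdir = workdir
        self.path = None
        self.unwritten = []

    def start(self, endpoint="file"):
        # A bare "file" endpoint lets the server pick the path itself.
        if endpoint == "file":
            self.path = os.path.join(self.workdir, "capture.bin")
        else:
            self.path = endpoint
        open(self.path, "wb").close()

    def inject(self, pkts):
        self.unwritten.extend(pkts)      # matched but not yet flushed

    def stop(self):
        return {"endpoint": self.path}   # hypothetical response key

    def fetch(self, flush=2):
        # Flush a few packets per call and report how many remain.
        with open(self.path, "ab") as f:
            for pkt in self.unwritten[:flush]:
                f.write(pkt)
        self.unwritten = self.unwritten[flush:]
        return {"pending": len(self.unwritten)}

    def remove(self):
        os.remove(self.path)
        self.path = None


def capture_to_file(server, local_copy, pkts):
    server.start("file")
    server.inject(pkts)                       # stands in for live traffic
    remote_path = server.stop()["endpoint"]
    while server.fetch()["pending"] > 0:      # wait until all is on disk
        pass
    shutil.copyfile(remote_path, local_copy)  # stands in for rsync/scp
    server.remove()                           # server deletes its file
    return remote_path
```

The point of polling "pending" before copying is to avoid fetching a file that the server is still writing.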

Optionally, I think I can improve the "capture monitor" performance (~100kpps) by adding a ZMQ endpoint.

If you agree with this change, I'll prepare a PR for the FILE endpoint first.

hhaim commented 2 years ago

@jsmoon looks good.

cisco-system-traffic-generator / trex-core, issue #822