Closed hagfelsh closed 1 year ago
I think you're thinking of the dwdump utility, not dwcap, but no matter, you have the purpose of the utility correct. What you're missing is the fact that dwdump didn't get added to the dropwatch package until release 1.5.2, so it's certainly not going to be available in RHEL7's 1.4 package. Even if you build the latest dropwatch, the kernel support for using dwdump isn't going to be present in the RHEL7 kernel, so you're out of luck there, unless you want to write a very large check to IBM :)
Oh how about that lol thanks for the quick reply!
That prompts another question on the side; your tool is the standard for capturing dropped traffic in Linux. What else exists in the world that does anything like this? As you might have guessed, I'm trying to understand what's being dropped at the driver and I've not yet found any way to determine what it is.
At the driver level you're generally left with 2 choices: 1) a custom bpf or systemtap program you write to monitor specific code paths 2) some custom driver level interface debug tool
(2) isn't going to exist for any open source driver, but some proprietary drivers may have something for you
systemtap is usually a pretty good way to drill down on what you're looking for, but a better step 0 is to take a look at the data you have that is suggesting that you are dropping packets and brainstorm causes. What data do you have that is suggesting dropped packets in the driver?
Yikes I'm at the edge of the world!
The only evidence I have that it's the driver is that the drop counters are incrementing under /sys/devices/
What driver?
i40e, for the X710.
i40e grabs those stats from the hardware (mapping the software rx_dropped stat to its hardware rx_discards counter(s)). You can check the function i40e_stats_update_rx_discards to see how it works: it calls i40e_stat_update64, which pulls hardware stats from the chip and updates the software counter structures.
So you're kinda out of luck searching for a software drop in the driver, because there isn't one. The drops are occurring in the hardware, before the driver ever receives the frames. You can use ethtool -S to get more detailed stats, as I think the i40e driver breaks out drop stats into something a little more granular that might give you an idea of why this is happening. That said, usually the cause for something like this is an overrun, i.e. the data coming in on the hardware is getting hashed to a receive queue that the corresponding CPU can't keep up with, and so the hardware drops frames because the CPU isn't draining the queue fast enough. I suggest using ethtool to check queue lengths, and to check hash destinations using the ntuple settings. It won't help you with a root cause, but you also might try enabling pause frames to prevent drops of this nature.
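To make the ethtool -S suggestion a bit more concrete, here is a rough Python sketch that takes two samples of the stats and reports which drop-like counters grew in between. It assumes the usual "name: value" line layout of ethtool -S output. The helper names (parse_ethtool_stats, growing_counters, watch) and the name filter are made up for illustration, and the counter names a given driver exposes will vary, so adjust the pattern to whatever your NIC actually reports.

```python
import re
import subprocess
import time


def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output ("name: value" lines) into a dict."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon; header lines have no numeric value and are skipped.
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def growing_counters(before, after, pattern=r"drop|discard|miss|error"):
    """Return {name: delta} for counters matching `pattern` that increased."""
    rx = re.compile(pattern, re.IGNORECASE)
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and rx.search(name) and after[name] > before[name]
    }


def watch(iface, seconds=10):
    """Sample `ethtool -S <iface>` twice and return the counters that grew.

    Assumes ethtool is installed and the caller may query the interface.
    """
    def sample():
        result = subprocess.run(
            ["ethtool", "-S", iface],
            capture_output=True, text=True, check=True,
        )
        return parse_ethtool_stats(result.stdout)

    first = sample()
    time.sleep(seconds)
    return growing_counters(first, sample())
```

Calling watch("eth0") (the interface name is a placeholder) while the drops are happening should narrow things down to the specific counters that are moving. If the driver exposes per-queue counters and only one queue's numbers climb, that would be consistent with the overrun theory above.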
This is marvelous advice, thank you so much!
As I understand it, dwcap captures packets that would or will be dropped so they can be examined retrospectively with wireshark.
Examining the dropwatch-1.4-9.el7.x86_64.rpm shows no such file, though the source rpm shows src/dwcap.c.
Am I misunderstanding how to find this particular tool?