the-tcpdump-group / libpcap

the LIBpcap interface to various kernel packet capture mechanism
https://www.tcpdump.org/
Other
2.72k stars 855 forks source link

DAG: Add experimental packet transmit support. #1380

Closed infrastation closed 1 week ago

infrastation commented 1 month ago

Here is the first version after clean-ups. It works if used correctly, but still has a few rough edges. Also it makes it easier to see where the current libpcap API is not a perfect fit for a mix of Rx-only and Tx-only devices.

It is not clear if it would be better to reuse the Rx FCS environment variables for Tx, or to query the card for the current parameters to minimise user intervention, or maybe to justify introducing C functions into the API to query/set FCS settings on a pcap_t.

Likewise, in this revision libpcap returns an error rather late: from pcap_read() when trying to capture from a Tx-only stream and from pcap_inject() when trying to transmit into an Rx-only stream. It would be better to have feedback right from pcap_activate() (something such as PCAP_ERROR_SENDING_NOTSUP), which would in this case depend on the user declaring the mode of use before the activation by calling a new function such as pcap_set_device_mode(), which would change the default "capture and inject" to "capture only" or "inject only". Alternatively, there could be a function to query the device capabilities. This certainly requires more thought before implementing anything.

DAG API may be a convenient use case for testing new functions for multiple-packet transmission (Guy refers to Npcap pcap_sendqueue_*() functions as a potential model).

infrastation commented 4 weeks ago

On a related note, it could be useful to have pcap_set_snaplen()/pcap_snapshot() return PCAP_ERROR or even a more specific error in case libpcap does not support snapshot length setting/getting for a particular device.

guyharris commented 4 weeks ago

On a related note, it could be useful to have pcap_set_snaplen()/pcap_snapshot() return PCAP_ERROR or even a more specific error in case libpcap does not support snapshot length setting/getting for a particular device.

Haiku doesn't support it, but we implement that by just setting the caplen field in the header, so that if the snapshot length is set in order to reduce the size of a capture file being written, that works. That's what we should do in other cases where the capture mechanism itself doesn't slice packets; it doesn't save buffer space in the capture mechanism, or memory bandwidth used to put the data in the capture mechanism's buffer, but it'll still shorten the packet records.

infrastation commented 4 weeks ago

That's a good point. If packets get truncated only as a result of a BPF program, that's fine no matter if the filter runs in userspace, the kernel or the hardware. The problem then could be that the BPF program should have enough input to run correctly in as many cases as possible, in which case the hardware snaplen should be sufficient for the BPF program. Which is yet another factor for DAG.

sfd commented 4 weeks ago

PR looks reasonable to me, although I have not tested it myself.

infrastation commented 3 weeks ago

Found a bug in the padding size calculation, this will be replaced with the next revision after some testing.

infrastation commented 2 weeks ago

This revision gets ERF record padding right in dag_inject(), uses a custom alignment value for 9.2 cards, adds a couple missing ret = PCAP_ERROR; lines and addresses a compiler warning.

Short Ethernet frames work on 7.5G2, but not on 9.2X2 yet.

infrastation commented 2 weeks ago

This revision pads short Ethernet frames to the minimum required length. It also automatically detects the FCS length the card is expecting in outbound ERF records and appends a dummy FCS as required (README updated). The practical difference is that using 9.2X2 previously for icmp-rfc8335.pcap only 8 packets were transmitted successfully out of 10, now it is 10 out of 10; for lmp.pcap it improved from 16 packets out of 18 to 18 out if 18. However, for AoE_Linux.pcap around a third of the packets still fails to transmit, this requires more work.

Also some new variables got better names, some of the new code in dag_activate() is now a standalone function, dag_inject_bytes64() argument has been constified and pd->tx_align_bytes assignment now hinges on TX_ONLY().

Declare dag_read_notimpl() and dag_inject_notimpl() regardless of ENABLE_DAG_TX. Instead of testing for TX_ONLY() and RX_ONLY() in dag_read() and dag_inject() respectively test in dag_activate() and assign the appropriate function pointers once.

infrastation commented 2 weeks ago

This revision waits for the stream buffer to drain before stopping a Tx stream, which fixes the last known issue with this code. The ERF padding buffer became a static variable. pcap_activate() now fails to activate a Tx stream if the card's FCS size is set to 16 bit. There are assorted minor improvements to code style and comments.

Still a draft because the current revision requires an updated commit message. Also macOS CI failed due to CMake errors, which I have not yet figured.

guyharris commented 2 weeks ago

Also macOS CI failed due to CMake errors, which I have not yet figured.

CMake Error (dev) at CMakeLists.txt:2959 (add_custom_command):
  The following keywords are not supported when using
  add_custom_command(OUTPUT): SOURCE.

  Policy CMP0175 is not set: add_custom_command() rejects invalid arguments.
  Run "cmake --help-policy CMP0175" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.
This error is for project developers. Use -Wno-error=dev to suppress it.

-- Parser generator: /usr/bin/bison
CMake Error (dev) at CMakeLists.txt:3034 (add_custom_command):
  The following keywords are not supported when using
  add_custom_command(OUTPUT): SOURCE.

  Policy CMP0175 is not set: add_custom_command() rejects invalid arguments.
  Run "cmake --help-policy CMP0175" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.
This error is for project developers. Use -Wno-error=dev to suppress it.

The CMake documentation page for that policy says that policy was added in CMake 3.31.

Lines 2959 and afterwards, in this pull request's version of CMakeLists.txt, are:

add_custom_command(
    OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/scanner.c ${CMAKE_CURRENT_BINARY_DIR}/scanner.h
    SOURCE ${pcap_SOURCE_DIR}/scanner.l
    COMMAND ${LEX_EXECUTABLE} -P pcap_ --header-file=scanner.h --nounput -o${CMAKE_CURRENT_BINARY_DIR}/scanner.c ${pcap_SOURCE_DIR}/scanner.l
    DEPENDS ${pcap_SOURCE_DIR}/scanner.l
)

And, yes, they're using the SOURCE keyword.

But those lines aren't new; perhaps, for some reason, those builds were done with CMake 3.31 or later.

That line dates back to 6c8b3e4bf2272f8a6df43cb9332f6d89c5470f3f, in 2015. I'm not sure why I thought it needed to be there, but I'll see what happens if I remove it.

guyharris commented 2 weeks ago

Also macOS CI failed due to CMake errors, which I have not yet figured.

Rebase against main; that should fix it.

infrastation commented 2 weeks ago

Done. macOS passes now, thank you.

guyharris commented 2 weeks ago

Done. macOS passes now, thank you.

Thank you - and whoever upgraded the macOS builder to CMake 3.31 - for showing out that building with 3.31 was broken and that we needed to fix it.

infrastation commented 2 weeks ago

Looks ready to me, this is going to be merged in a few days unless something turns out broken.

In this revision pcap_stats() returns an error for Tx-only streams, more Tx-specific code stays out of dag_activate(), there are fewer tests for ENABLE_DAG_TX, dag_inject() uses a purpose-defined ETH_MAXLEN_NOFCS constant, has slightly better comments and no longer uses a helper function. CMake particulars are now correct in the README file.

A few things still look a bit off in the Rx-specific part of the module, but these are not new and can be improved separately.

infrastation commented 1 week ago

This revision uses more stub functions for Tx-only streams. This is going to be merged anytime soon.

guyharris commented 1 week ago

This revision uses more stub functions for Tx-only streams. This is going to be merged anytime soon.

Is, or is not?

infrastation commented 1 week ago

It is, as soon as Appveyor ticks it off. If you see reasons not to merge, please make it clear now.

guyharris commented 1 week ago

It is, as soon as Appveyor ticks it off. If you see reasons not to merge, please make it clear now.

No objections; it's just that, in English, "XXX will not happen anytime soon" is a common phrase, with the opposite bieng "XXX will happen sometime soon", i.e., in predicate calculus terms, it's either "for all times soon t, XXX will not happen at time t" or "there exists a time soon t such that XXX will happen at time t".

It appears that you meant the latter of those, i.e. "it will happen as soon as Appveyor finishes".

infrastation commented 1 week ago

Thank you for clarifying, perhaps I copied the phrase from a wrong source, or maybe different sources word it differently. This is merged now.