IETF-OPSAWG-WG / draft-ietf-opsawg-pcap

PCAP next generation file format specification
Other
263 stars 59 forks source link

Q: The FCS information in the PCAP file header in draft-ietf-opsawg-pcap-00 #114

Closed noriov closed 2 years ago

noriov commented 2 years ago

Hello,

I'm new to PCAP and PCAPNG. I'm writing a small routine to save packets in the PCAP file format. I have a question regarding to the FCS information in the PCAP file header described in draft-ietf-opsawg-pcap-00.

According to draft-ietf-opsawg-pcap-00, the FCS information is encoded as following.

       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    20 | FCS |f|0 0 0 0 0 0 0 0 0 0 0 0|         LinkType              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

But, according to comments in wireshark/wiretap/libpcap.c (L.362-399), the same field is encoded as following.

       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    20 |  FCS  |0|f|       Class       |         LinkType              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The macros and comments in libpcap/pcap/pcap.h (L.217 - 225) and libpcap/pcap/dlt.h (L.1593 - 1610) also assumes the same format.

I guess the authors and the working group decided not to describe the 10-bit class field as some other fields are described as "reserved". So, my question is about the format of the FCS information.

If my understanding above is correct, there are two formats to encode the FCS information (i.e., 3-bit + 1-bit or 4-bit + 1-bit). Is the format being changed? If so, is there a method to distinguish the difference? Is it possible to just ignore the existence of the FCS information?

Thank you.

p.s. References: (as of March 4, 2022) https://github.com/wireshark/wireshark/blob/master/wiretap/libpcap.c#L362 https://github.com/the-tcpdump-group/libpcap/blob/master/pcap/pcap.h#L217 https://github.com/the-tcpdump-group/libpcap/blob/master/pcap/dlt.h#L1593 https://www.ietf.org/archive/id/draft-ietf-opsawg-pcap-00.txt

guyharris commented 2 years ago

I'm new to PCAP and PCAPNG. I'm writing a small routine to save packets in the PCAP file format. I have a question regarding to the FCS information in the PCAP file header described in draft-ietf-opsawg-pcap-00.

According to draft-ietf-opsawg-pcap-00, the FCS information is encoded as following.

       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    20 | FCS |f|0 0 0 0 0 0 0 0 0 0 0 0|         LinkType              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

That is incorrect. I have just checked in a change to the spec to reflect reality.

But, according to comments in wireshark/wiretap/libpcap.c (L.362-399), the same field is encoded as following.

       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    20 |  FCS  |0|f|       Class       |         LinkType              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The macros and comments in libpcap/pcap/pcap.h (L.217 - 225) and libpcap/pcap/dlt.h (L.1593 - 1610) also assumes the same format.

That is correct, except that the Class field no longer applies. It was put in there for the benefit of something NetBSD was doing, where they had at least one link-layer type corresponding to "raw" link types, with the Class value being 0x224 and, if that's the Class value, the LinkType being a NetBSD AF_ value.

NetBSD appear to have dropped that notion, and pcapng has only a 16-bit LinkType field and no provision for those "raw" link types. (If an option were added to the pcapng IDB to indicate a Class value, old code that doesn't know about the option will neither correctly interpret the LinkType value based on the Class value nor reject the file as having an unknown link-layer type; adding support for it would have to involve something such as an "IDB for non-standard LinkType values", so older software can at least reject it rather than misdissecting it.)

So...

I guess the authors and the working group decided not to describe the 10-bit class field as some other fields are described as "reserved".

...I just marked that as reserved; it will probably remain reserved forever. (Or, at least, until 2038, at which point pcap can't handle current time stamps.)

So, my question is about the format of the FCS information.

If my understanding above is correct, there are two formats to encode the FCS information (i.e., 3-bit + 1-bit or 4-bit + 1-bit).

There's a format in a document that I constructed and misdescribed the FCS in, and there's a format that code actually uses, and that I updated the document on the GitHub site to describe.

The 3-bit + 1-bit is the first of those; the 4-bit + 1-bit is the one that code actually uses.

Ignore the first of those, and follow the second of those.

Is the format being changed?

No, the document is being fixed. The page at https://pcapng.github.io/pcapng/draft-ietf-opsawg-pcap.html now reflects the fix.

If so, is there a method to distinguish the difference?

No, but I know of no cases where the incorrect format is used, and the libpcap code for DAG cards uses the correct format (and at some point I'll probably check in a change to Wireshark to use the correct format - currently, Wireshark doesn't look at the FCS stuff in that field at all), so any code that uses the incorrect format will have to be fixed.

Is it possible to just ignore the existence of the FCS information?

As a reader, or as a writer?

As a writer, the correct way to ignore its existence is to make the upper 16 bits of that field zero; that means that the presence bit is 0, and readers will assume the FCS information isn't available. That's also the correct thing to do if you truly don't know whether the packets have an FCS or not, or if, for example, incoming packets have them but outgoing packets don't. (For a case such as that, you'd need to use pcapng - or, as with pcap, just rely on the dissector to try to guess whether there's an FCS or not, as Wireshark does.)

guyharris commented 2 years ago

Note also that the draft you mentioned has a worse problem. See #106, which was filed in response to some private email, the first message of which said:

Hi Guy and Michael,

thanks for writing this draft, it will be good to finally have an RFC for this format. A couple of comments below.

I think bytes 20 through 24 of the File Header (Fig 1) are incorrect. Perhaps you meant to have the LinkType field come first, at least for a little-endian machine? Example:

# hexdump of little endian PCAP using Figure 1 formatting:
d4 c3 b2 a1
02 00 04 00
00 00 00 00
00 00 00 00
ff ff 00 00
00 00 01 00 

# hexdump of little endian PCAP using https://wiki.wireshark.org/Development/LibpcapFileFormat formatting:
d4 c3 b2 a1
02 00 04 00
00 00 00 00
00 00 00 00
ff ff 00 00
01 00 00 00

and my response was:

thanks for writing this draft, it will be good to finally have an RFC for this format. A couple of comments below.

I think bytes 20 through 24 of the File Header (Fig 1) are incorrect. Perhaps you meant to have the LinkType field come first, at least for a little-endian machine?

The upper 32 bits of what was originally a 32-bit link-layer type field were stolen for FCS information (and, originally, other purposes that NetBSD had, but it no longer seems to use them for that purpose).

So if you view that field as a 32-bit value, with the bit on the right being the least-significant bit, and the bit on the left being the most-significant bit, that figure is correct.

I.e., it's correct after the field is converted to host byte order. In the file, that's not the case.

It should probably be described as a 32-bit link-layer type plus additional information field, so people know that it has to be converted to host byte order as a 32-bit quantity, with the subfields described below that.

which Michael quoted in the issue.

I changed the document to fix that, but got it wrong, as you noticed. It's good that the first token of the document's name is "draft". :-)

noriov commented 2 years ago

Thank you very much for your kind reply.

I clearly understand the format of the FCS information. And, thank you for the change. I read your detailed description at https://github.com/pcapng/pcapng/commit/4c281b0f5e2fa6d1bc8ad40af6e62af904e793ff

I think it was a hard task to describe the specification of the PCAP file format, because it has a long history and many variants. Thank you very much for your incredible efforts.

Because I am new to the PCAP file format, what I can contribute to your work might be comments from newcomer's point of view. So, here are three comments.


(1) Abbreviation and expanded form

In the current version in Github (as of March 4, 2022), the abbreviation "FCS" appears without its expanded form "Frame Cyclic Sequence".

It would be helpful for readers to change the first use of "FCS" to "Frame Cyclic Sequence (FCS)".


(2) Supplementary figure

As you suggested in the second e-mail, endianness is often a source of careless bugs. To my shame, in an early version of my PCAP writer, I declared "LinkType and additional information (32 bits)" as two uint16 fields and got unexpected results. (I'm using X86_64).

As a non-native English user, I should carefully read figures, bit numbers, adjectives, prepositions, etc., to avoid misunderstanding. A couple of years ago, I wrote the following figure to understand a format correctly. This figure was a great help for me.

 * Segment Descriptor (64 bit)
 *
 * Format in LSB first (lower address first / lower bit first):
 *   0 1 2 3 4 5 6 7 8 9 A B C D E F 0 1 2 3 4 5 6 7 8 9 A B C D E F
 *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *  |  Segment Limit (Bit 0 - 15)   |   Base Address (Bit 0 - 15)   |
 *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *  | Base(16 - 23) | Type  |Flags_L|L 16-19|Flags_H| Base(24 - 31) |
 *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *
 * Format in MSB first (higher address first / higher bit first):
 *   F E D C B A 9 8 7 6 5 4 3 2 1 0 F E D C B A 9 8 7 6 5 4 3 2 1 0
 *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *  | Base(31 - 24) |Flags_H|L 19-16|Flags_L| Type  | Base(23 - 16) |
 *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *  |   Base Address (Bit 15 - 0)   |  Segment Limit (Bit 15 - 0)   |
 *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

(ref. https://en.wikipedia.org/wiki/Segment_descriptor )

So, I think a supplementary figure like following might be a good help for non-native English speakers.

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  FCS  |0|f|0 0 0 0 0 0 0 0 0 0|         LinkType              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

(3) No padding

To my another shame, in an early version of my PCAP writer, I mistakenly put padding to a 32-bit boundary, and got unexpected results. The reason of my mistake was that my memory confused the PCAP packet record format with the PCAPNG EPB format. Padding alignment is often an another source of careless bugs.

It may be a help for careless readers like me to add a note saying "no padding" to the description of the Packet Data field in the packet record section.


Thank you.

guyharris commented 2 years ago

(1) Abbreviation and expanded form

In the current version in Github (as of March 4, 2022), the abbreviation "FCS" appears without its expanded form "Frame Cyclic Sequence".

It would be helpful for readers to change the first use of "FCS" to "Frame Cyclic Sequence (FCS)".

Frame Check Sequence, at least according to IEEE 802.3-2018, not Frame Cyclic Sequence (most Frame Check Sequences happen to be Cyclic Redundancy Check values, but that's not an absolute requirement for arbitrary FCSes, it's just how various protocols specify that an FCS is computed).

I've added that in 19119c9d2d0b6441045fe52fbd3ce2b01c9e49dd.

(2) Supplementary figure

As you suggested in the second e-mail, endianness is often a source of careless bugs. To my shame, in an early version of my PCAP writer, I declared "LinkType and additional information (32 bits)" as two uint16 fields and got unexpected results. (I'm using X86_64).

As a non-native English user, I should carefully read figures, bit numbers, adjectives, prepositions, etc., to avoid misunderstanding. A couple of years ago, I wrote the following figure to understand a format correctly. This figure was a great help for me.

 * Segment Descriptor (64 bit)
 *
 * Format in LSB first (lower address first / lower bit first):
 *     0 1 2 3 4 5 6 7 8 9 A B C D E F 0 1 2 3 4 5 6 7 8 9 A B C D E F
 *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *    |  Segment Limit (Bit 0 - 15)   |   Base Address (Bit 0 - 15)   |
 *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *    | Base(16 - 23) | Type  |Flags_L|L 16-19|Flags_H| Base(24 - 31) |
 *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Thanks for cutting the field up like that, Intel! :-)

 * Format in MSB first (higher address first / higher bit first):
 *     F E D C B A 9 8 7 6 5 4 3 2 1 0 F E D C B A 9 8 7 6 5 4 3 2 1 0
 *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *    | Base(31 - 24) |Flags_H|L 19-16|Flags_L| Type  | Base(23 - 16) |
 *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *    |   Base Address (Bit 15 - 0)   |  Segment Limit (Bit 15 - 0)   |
 *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

(ref. https://en.wikipedia.org/wiki/Segment_descriptor )

So, I think a supplementary figure like following might be a good help for non-native English speakers.

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  FCS  |0|f|0 0 0 0 0 0 0 0 0 0|         LinkType              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

I've done that in 3b121bcfe57918ef66202a7a5cd2b340d5433426; with the version of RFC markdown and XML we're using, we can't put that in the bulleted list of fields in the file header and, if we replaced the "LinkType and additional information" with that figure, we'd be back to where we were, when people could interpret the last 4 bytes of the file header as two 16-bit fields, so it took a bit of additional surgery, done in several commits.

(The process of doing make on my machine required me to update some tools, and then failed with the same problem that the GitHub Actions for this repository were getting and that prevented the formatted versions of the drafts from getting updated. I filed an issue cabo/kramdown-rfc#163 on kramdown-rfc, one of the tools I updated, and the maintainer filed pull request #112 to work around the kramdown-rfc issue and to remove some unneeded empty strings in the Markdown; the issue was that we needed to stop doing

   author:
   - ins: ''
     name: ''
     org: The Tcpdump Group

and, instead, do

   author:
   - org: The Tcpdump Group

.That, plus 914c61122df2fbee7dc502e66652f164bae30293, fixed the two problems that were causing the GitHub Action to fail, so the formatted versions of the draft are now getting updated again.

The problem with kramdown-rfc is that it crashes inside Ruby rather than telling the user what to fix in YAML in their Markdown document. They fixed that, making kramdown-rfc print a warning rather crashing, in what will presumably be kramdown-rfc-1.6.4. See cabo/kramdown-rfc@548db413004a0ce73da9afda77670c4da65cd1bc.

The CircleCI builds are still failing, probably because, when they attempt to push the documents to GitHub Pages, they run a Git command that includes a token that doesn't work, for some reason, so Git asks for a password but never gets it, so it eventually times out.

I've filed an issue, #113, asking whether we need the CircleCI testing for any reason.)

(3) No padding

To my another shame, in an early version of my PCAP writer, I mistakenly put padding to a 32-bit boundary, and got unexpected results. The reason of my mistake was that my memory confused the PCAP packet record format with the PCAPNG EPB format. Padding alignment is often an another source of careless bugs.

It may be a help for careless readers like me to add a note saying "no padding" to the description of the Packet Data field in the packet record section.

Done in 98a1e0567c24c396d08dc465a66f920c57e0c48e, with additional work in 776d05df20ed1cc15915fc6f37015a717e8fc693.

noriov commented 2 years ago

Thank you very much for all the changes you made to draft-ietf-opsawg-pcap, wireshark and libpcap.