securitytxt / securitytxt.org

Static website for security.txt.
https://securitytxt.org
MIT License
65 stars 35 forks source link

security.txt not well-formatted #94

Open eikendev opened 1 year ago

eikendev commented 1 year ago

Hi everyone,

When I ran sectxt against securitytxt.org, I noticed that the security.txt file is not validated successfully.

$ curl -LSs https://raw.githubusercontent.com/securitytxt/securitytxt.org/master/.well-known/security.txt | hexyl
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 2d 2d 2d 2d 2d 42 45 47 ┊ 49 4e 20 50 47 50 20 53 │-----BEG┊IN PGP S│
│00000010│ 49 47 4e 45 44 20 4d 45 ┊ 53 53 41 47 45 2d 2d 2d │IGNED ME┊SSAGE---│
│00000020│ 2d 2d 0a 48 61 73 68 3a ┊ 20 53 48 41 35 31 32 0a │--_Hash:┊ SHA512_│
[...]

Here you see that the first line ends with \n; but RFC 9116 specifies the cleartext header lines to end in \r\n:

cleartext-header =  %s"-----BEGIN PGP SIGNED MESSAGE-----" CRLF
[...]
CRLF             =  CR LF

My guess is that this happens due to Git, which normalizes newlines. You can ask Git to treat the file as binary using a .gitattributes file.

Please note: this is also true for other lines of the cleartext message, except for the actual cleartext body.

oh2fih commented 10 months ago

Is this really a problem? RFC 9116, 2.2 allows both CR LF / LF. It is problematic that the ABNF Grammar on section 4 is in contradiction with that. Does that requirement come from elsewhere?

As I see, OpenPGP Message Format (RFC 4880) canonicalizes the signed text documents by converting LF to CR LF before signing (RFC 4880, 5.4.2), but then they are converted back when storing the resulting signed message (RFC 4880, 5.9):

A one-octet field that describes how the data is formatted. - - If it is a t (0x74), then it contains text data, and thus may need line ends converted to local form, or other text-mode changes.

The remainder of the packet is literal data.

Text data is stored with <CR><LF> text endings (i.e., network-normal line endings). These should be converted to native line endings by the receiving software.

This is probably rather a bug in the sectxt that is too strict regarding the ABNF Grammar, which is caused by a mistake in the ABNF Grammar definition. If so, I would like to report errata to the RFC 9116.

oh2fih commented 10 months ago

My securitytxt-signer.sh security.txt formatter & PGP signer converts all the possible CRLF to LF for other reasons. I have added the same findings in the Rationale behind validation decisions section.

oh2fih commented 10 months ago

Both https://securitytxt.org/.well-known/security.txt & the file on the repository have LF newlines.

Here's some tests to confirm both LF & CRLF are equally suitable line separators for security.txt as section 2.2 states.

Original LF line feeds:

$ curl -O https://securitytxt.org/.well-known/security.txt

$ sha256sum security.txt 
7902b537fcca4f617b6a8065a55a5e5c5265c994b4a3ef721ddddf31c6ee680a  security.txt

$ xxd security.txt | head -n 3
00000000: 2d2d 2d2d 2d42 4547 494e 2050 4750 2053  -----BEGIN PGP S
00000010: 4947 4e45 4420 4d45 5353 4147 452d 2d2d  IGNED MESSAGE---
00000020: 2d2d 0a48 6173 683a 2053 4841 3531 320a  --.Hash: SHA512.

$ gpg --verify security.txt
. . .
:signature packet: algo 22, keyid B7ACAF980A48DAA7
    version 4, created 1675268457, md5len 0, sigclass 0x01
    digest algo 10, begin of digest 00 0b
    hashed subpkt 33 len 21 (issuer fpr v4 AC3F6904768283545A5283ABB7ACAF980A48DAA7)
    hashed subpkt 2 len 4 (sig created 2023-02-01)
    subpkt 16 len 8 (issuer key ID B7ACAF980A48DAA7)
    data: [255 bits]
    data: [256 bits]
. . .
gpg: Good signature from "Ed Foudil <contact@edoverflow.com>" [expired]

Convert to CRLF:

$ unix2dos security.txt 
unix2dos: converting file security.txt to DOS format...

$ sha256sum security.txt 
ff2c496c34358097e8a727a03db8103c76e1523f4bb8bca1c532f87352ca9c36  security.txt

$ xxd security.txt | head -n 3
00000000: 2d2d 2d2d 2d42 4547 494e 2050 4750 2053  -----BEGIN PGP S
00000010: 4947 4e45 4420 4d45 5353 4147 452d 2d2d  IGNED MESSAGE---
00000020: 2d2d 0d0a 4861 7368 3a20 5348 4135 3132  --..Hash: SHA512

$ gpg --verify security.txt
. . .
:signature packet: algo 22, keyid B7ACAF980A48DAA7
    version 4, created 1675268457, md5len 0, sigclass 0x01
    digest algo 10, begin of digest 00 0b
    hashed subpkt 33 len 21 (issuer fpr v4 AC3F6904768283545A5283ABB7ACAF980A48DAA7)
    hashed subpkt 2 len 4 (sig created 2023-02-01)
    subpkt 16 len 8 (issuer key ID B7ACAF980A48DAA7)
    data: [255 bits]
    data: [256 bits]
. . .
gpg: Good signature from "Ed Foudil <contact@edoverflow.com>" [expired]

Back to original:

$ dos2unix security.txt 
dos2unix: converting file security.txt to Unix format...

$ sha256sum security.txt 
7902b537fcca4f617b6a8065a55a5e5c5265c994b4a3ef721ddddf31c6ee680a  security.txt

This confirms the error is in the ABNF grammar rather than with the file.

oh2fih commented 10 months ago

I was about to report the errata to RFC 9116, but it got more interesting as the ABNF Grammar also states:

The file format of the "security.txt" file MUST be plain text (MIME type "text/plain") as defined in Section 4.1.3 of [RFC2046] and MUST be encoded using UTF-8 [RFC3629] in Net-Unicode form [RFC5198].

The requirement of CRLF comes from there:

Considering that, there is a dilemma:

Please share your ideas on which of these approaches would be better, @eikendev @EdOverflow @nightwatchcyber!

yakovsh commented 10 months ago

Even if RFCs 2048 and 5198 are referenced, this RFC can still override them somewhat which is what we tried to do with section 2.2. IMHO the best course of action would be an errata to the ABNF but something similar to this:

CRLF             =  [CR] LF
                      ; both CRLF and LF line separators can be used (see Section 2.2)
oh2fih commented 10 months ago

@yakovsh I like that idea as it would handle the issue keeping the errata report simple. The result would be less strict validation as intended in the section 2.2 that would be more practical, too. Redefining CRLF locally as [CR] LF does not seem elegant, but sure it is efficient. I can use the "notes" field for summarizing the rationale behind the suggested modification.

I will probably file the errata report later during the weekend. If confirmed, the report can be used to adjust the validators (like eikendev/sectxt) to comply with the intent of RFC 9116 rather than the strict ABNF grammar.

oh2fih commented 10 months ago

This has now been reported as Errata ID 7743.

   CRLF             =  [CR] LF
                         ; Both CRLF and LF line separators can be used
                         ; (see Section 2.2) as long as the entire file
                         ; uses the chosen line separator.

Notes:

RFC 9116 section 2.2 accepts both CRLF and LF line separators. There is a contradiction in the ABNF Grammar as it suggests only CRLF would be allowed elsewhere whereas LF is an option in "cleartext" & "eol". For consistency, the CRLF should either be mandatory or optional on the entire file, and only CRLF or LF should be used in a single file instead of mixing them.

The referenced RFC 2046 (section 4.1.1) and 5198 (section 2) have chosen the CRLF sequence as a MUST. On the other hand, the context is OpenPGP Message Format that canonicalizes the signed text documents by converting LF to CRLF before signing (RFC 4880, 5.4.2), and the receiving software should convert them to native line endings (RFC 4880, 5.9).

This report respects the intent of section 2.2 to treat line separators more liberally and recognizes that it is not an issue in the context of RFC 4880. The goal is to describe this in the ABNF Grammar with the smallest possible change, resulting in "CRLF" being locally redefined as "[CR] LF".