Closed seivadnomis closed 1 year ago
It's probably noteworthy that `quoted-pair` is composed of `"\" CHAR` in https://www.rfc-editor.org/rfc/rfc822 (p. 10), with `CHAR` being defined as
                                ; (  Octal, Decimal.)
CHAR = <any ASCII character>    ; (  0-177,  0.-127.)
or 0x00-7F, whereas `mt-qpair` restricts the ASCII values to 0x09-7E. Assuming this is intentional, a note in the docs seems appropriate.
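To make the difference concrete, here is a small sketch that lists the code points RFC 822 allows after the backslash but `mt-qpair` does not. It assumes the 0x09-7E range reported above for `mt-qpair` and reads the RFC 822 `CHAR` range as octal 0-177 (decimal 0-127, hex 0x00-0x7F). The variable names are illustrative only.

```python
# RFC 822 CHAR: octal 0-177, decimal 0-127, i.e. 0x00 through 0x7F inclusive.
rfc822_char = set(range(0x00, 0x80))

# mt-qpair payload as described in this issue: 0x09 through 0x7E inclusive.
# (Assumption: taken from the range quoted above, not re-checked against the spec.)
mt_qpair_char = set(range(0x09, 0x7F))

# Code points that may follow "\" in RFC 822 but not in mt-qpair.
excluded = sorted(rfc822_char - mt_qpair_char)

print([f"0x{cp:02X}" for cp in excluded])
```

Under those assumptions the excluded set is the control characters 0x00 through 0x08 plus DEL (0x7F).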
I believe the GEDCOM spec is wrong and should be fixed. Furthermore, the correct reference for media types has been, since 2013, RFC 6838 (the GEDCOM spec at least points to the IANA registry, which points to that RFC), aka BCP 13, and that RFC does have ABNF. The "summarized" ABNF in the GEDCOM spec is wrong in comparison (e.g., it allows a media type to start with "#"). In contrast, the sections defining g7:FORM and g7:MIME correctly point to BCP 13. It's only section 2.10 that contradicts the later sections in the GEDCOM spec.
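As an illustration of the "#" point, here is a minimal sketch of a validator built from the `restricted-name` ABNF in RFC 6838, section 4.2. The function name `is_valid_media_type` is hypothetical, and the check covers only a bare type/subtype pair without parameters.

```python
import re

# restricted-name from RFC 6838, section 4.2:
#   restricted-name       = restricted-name-first *126restricted-name-chars
#   restricted-name-first = ALPHA / DIGIT
#   restricted-name-chars = ALPHA / DIGIT / "!" / "#" / "$" / "&" / "-" / "^" / "_"
#                           / "." / "+"
RESTRICTED_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9!#$&\-^_.+]{0,126}")


def is_valid_media_type(value: str) -> bool:
    """Check a bare type/subtype pair (no parameters) against RFC 6838."""
    type_name, sep, subtype_name = value.partition("/")
    return (
        sep == "/"
        and RESTRICTED_NAME.fullmatch(type_name) is not None
        and RESTRICTED_NAME.fullmatch(subtype_name) is not None
    )


for candidate in ("text/plain", "application/vnd.api+json", "#text/plain", "text/pl ain"):
    print(candidate, is_valid_media_type(candidate))
```

The first character of each name must be alphanumeric, so a leading "#" is rejected, and a space is not permitted anywhere in a name.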
In my view, there is no reason to repeat ABNF from RFCs; we should instead just refer to it, as we do for Language. That said, there are multiple definitions for how to compose media types and parameters into a single string (often called a Content-Type, after the name of the header in HTTP and mail), and these appear to vary by protocol, so we should specify which one applies. HTTP uses a more liberal definition with spaces permitted around the semicolon delimiters, and I would argue we should match that, since HTTP use is prevalent and HTTP libraries may construct content type strings that way. See https://github.com/FamilySearch/GEDCOM/pull/251.
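A simplified sketch of what HTTP-style parsing might look like, with optional whitespace accepted around each ";" delimiter. It is loosely modeled on the `media-type` grammar in RFC 9110 and is not what GEDCOM or the linked pull request specifies. The name `parse_content_type` is hypothetical, and corner cases such as empty parameters or trailing whitespace are deliberately not handled.

```python
import re

# Token characters (tchar in RFC 9110): no spaces, no delimiters.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# Quoted string: DQUOTE, then plain text or backslash-escaped pairs, then DQUOTE.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# One parameter, with optional spaces or tabs around the ";" delimiter.
PARAM = re.compile(rf"[ \t]*;[ \t]*({TOKEN})=({TOKEN}|{QUOTED})")
MEDIA_TYPE = re.compile(rf"({TOKEN})/({TOKEN})")


def parse_content_type(value: str):
    """Split a Content-Type string into (type, subtype, params)."""
    head = MEDIA_TYPE.match(value)
    if head is None:
        raise ValueError(f"not a media type: {value!r}")
    params = {}
    pos = head.end()
    while pos < len(value):
        m = PARAM.match(value, pos)
        if m is None:
            raise ValueError(f"bad parameter at offset {pos}: {value!r}")
        name, raw = m.group(1), m.group(2)
        if raw.startswith('"'):
            # Strip the surrounding quotes and undo backslash escapes.
            raw = re.sub(r"\\(.)", r"\1", raw[1:-1])
        params[name.lower()] = raw
        pos = m.end()
    return head.group(1).lower(), head.group(2).lower(), params


print(parse_content_type("text/html ; charset=UTF-8"))
print(parse_content_type('text/plain;charset="utf-8"'))
```

Both the spaced and the compact forms parse to the same structure, which is the interoperability argument for accepting the more liberal syntax.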
In section 2.10 Media Type, `mt-char` is defined to include the space character `%x20`. But the corresponding production in RFC 2045, section 5.1 defines a token as any US-ASCII CHAR except SPACE, CTLs, or tspecials. It seems unlikely that a space can be part of the name of a media type, subtype or attribute, as GEDCOM 7.0 appears to say.
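For comparison, a minimal sketch of the RFC 2045 `token` rule as a predicate. The helper name `is_rfc2045_token` is hypothetical, and the `tspecials` set is transcribed from RFC 2045, section 5.1.

```python
# tspecials from RFC 2045, section 5.1: ( ) < > @ , ; : \ " / [ ] ? =
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_rfc2045_token(s: str) -> bool:
    """True if s is one or more US-ASCII chars, excluding SPACE, CTLs, tspecials."""
    # CTLs are 0x00-0x1F and 0x7F, SPACE is 0x20, so printable candidates
    # are 0x21 through 0x7E inclusive.
    return len(s) > 0 and all(
        0x21 <= ord(c) <= 0x7E and c not in TSPECIALS for c in s
    )


for name in ("plain", "pl ain", "vnd.api+json", "a/b"):
    print(repr(name), is_rfc2045_token(name))
```

A name containing `%x20` fails this check, which is the mismatch with `mt-char` described above.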