toml-lang / toml

Tom's Obvious, Minimal Language
https://toml.io
MIT License

Official MIME Type? #465

Closed hairyhenderson closed 6 years ago

hairyhenderson commented 7 years ago

There doesn't seem to be a registered MIME Type for TOML - are there any plans to register one?

From some quick Google searching I see one use of text/x-toml, but I'd suggest application/toml as a more appropriate type.

If an application hasn't been started yet, here is the place to start: https://www.iana.org/form/media-types.

Thanks!

Hrxn commented 7 years ago

Yes, application/toml

+1, this would be along the lines of JSON.

ChristianSi commented 7 years ago

-1, if there is a MIME type, it should be text/toml. TOML, as opposed to JSON, is very much for files that are meant to be read and written by humans, such as config files.

hairyhenderson commented 7 years ago

I don't feel too strongly about this, but based on https://tools.ietf.org/html/rfc2046#section-3 I think application/toml is still more appropriate.

Re: text (from the above - section 3.1):

          ...
          Other subtypes are to be used for enriched text in
          forms where application software may enhance the
          appearance of the text, but such software must not be
          required in order to get the general idea of the
          content.  Possible subtypes of "text" thus include any
          word processor format that can be read without
          resorting to software that understands the format.

Based on that, I think subtypes of text are intended more for unstructured (but possibly formatted) text, whereas application seems more appropriate for structured data, like TOML.

I'd argue that, while TOML emphasizes human-readability, it's still primarily a data format, intended to be read by applications. By contrast, a markdown file (text/markdown) conveys its full meaning without being processed further by an application, whereas TOML files are generally meaningless outside of the context of the application they are configuring.

lilydjwg commented 6 years ago

But we have text/{css,vcard,csv,html}. application/ sounds like it's not intended for people to read and write even if it's text (like JSON or mbox). application/javascript is strange, since many source types are text/ or text/x-*.

hairyhenderson commented 6 years ago

"application/* sounds like it's not intended for people to read and write even if it's text (like JSON or mbox)."

@lilydjwg I agree that's how it sounds... But neither JSON nor mbox is primarily intended to be read by humans. Whether they're human-readable is incidental IMO.

As for TOML, I still maintain that it's meaningless outside of the context of an application that parses it. A file written in TOML must be processed by some application in order for it to gain meaning. In contrast (in theory, if not always in practice), you could read a text/html file without missing any of the intended meaning.

I'd also contend that we shouldn't use the existing registered MIME types, especially older ones, as particularly good examples of RFC 2046 conformance 😉

hairyhenderson commented 6 years ago

Thanks @mojombo!

ChristianSi commented 6 years ago

The README now says: "the appropriate MIME type is application/toml."

However, this seems very bad advice, since https://www.iana.org/assignments/media-types/media-types.xhtml knows no such MIME type. Unless the MIME type has officially been registered (assuming that anyone even plans to do that), application/x-toml should be used (or something else with x- in it).

ChristianSi commented 6 years ago

I propose to re-open this and fix the README.

hairyhenderson commented 6 years ago

@ChristianSi the x- is discouraged as per https://tools.ietf.org/html/rfc6838#section-3.4

IMO it's totally reasonable to recommend application/toml before it's registered. Also IMO, it's up to @mojombo to initiate the registration process (it is, after all Tom's Own ... 😉).
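For anyone who wants to follow this recommendation before any registration happens, a minimal Python sketch of declaring the mapping locally might look like the following. It assumes the standard-library mimetypes module and a private MimeTypes instance; the helper name toml_content_type and the file name config.toml are made up for illustration, and application/toml is still unregistered at this point in the thread.

```python
import mimetypes

# Use a private registry so the process-wide table is left untouched.
registry = mimetypes.MimeTypes()

# Assumption: we pick application/toml, as recommended in this thread.
registry.add_type("application/toml", ".toml")


def toml_content_type(filename: str) -> str:
    """Return the media type this registry assigns to a file name."""
    guessed, _encoding = registry.guess_type(filename)
    # Fall back to opaque bytes when the extension is unknown.
    return guessed or "application/octet-stream"


print(toml_content_type("config.toml"))
```

A server or client would then send whatever this helper returns in its Content-Type header; swapping in a different type later only means changing the one add_type call.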

Again, the link to the registration form is https://www.iana.org/form/media-types...

ChristianSi commented 6 years ago

@hairyhenderson I stand corrected. However, I also note that the same RFC section says: "with the simplified registration procedures described above for vendor and personal trees, it should rarely, if ever, be necessary to use unregistered types." Hence I take it that the RFC does NOT recommend using x-free MIME types without registration.

Also, assuming some Tim one day designs Tim's Original Multimedia Layout (TOML) and registers the appropriate MIME type before @mojombo comes around to do it. Then we would be in trouble....

Hence, forget about the x-, but swiftly registering the MIME type still seems a good idea.

pradyunsg commented 6 years ago

+1 to registering the MIME type asap.

hairyhenderson commented 6 years ago

"the RFC does NOT recommend using x-free MIME types without registration."

Of course it doesn't - type registration is the subject of that RFC, after all 😉

"but swiftly registering the MIME type still seems a good idea."

I agree! But AFAIK, there's only one person who can reasonably do that 🙂

"Also, assuming some Tim one day designs Tim's Original Multimedia Layout (TOML) and registers the appropriate MIME type before @mojombo comes around to do it. Then we would be in trouble...."

I think this is unlikely. TOML's a well-enough-known format by now... And encouraging common usage of application/toml is IMO a good way to prevent this from happening - prior art, and all that.

patcon commented 6 years ago

In case anyone wants to help drive the registration process: https://github.com/toml-lang/toml/issues/574 ❤️

DeadWisdom commented 4 years ago

@hairyhenderson -- You've got this wrong here. The point of the text/ media type vs application/ is toward displaying the file to the user. An application/* object should make no sense to a user, and so a system should not even attempt to present it except within the context of its application.

For instance, given an attachment in an email, and the user clicks on it, the client should not present it, unless it knows how to handle that type. A text/* object however, should be displayed as text even if the client doesn't know what to do with it.
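The fallback behavior described here can be sketched in a few lines of Python. This is only an illustration, assuming the rule from the MIME RFCs that unrecognized text subtypes are handled like text/plain while other unrecognized types are handled like application/octet-stream. The KNOWN_TYPES set and the effective_type helper are hypothetical, not part of any real mail client.

```python
from email.message import Message

# Hypothetical set of types this imaginary client knows how to render.
KNOWN_TYPES = {"text/plain", "text/html", "application/json"}


def effective_type(content_type: str) -> str:
    """Pick the type a client would fall back to for an unknown subtype."""
    msg = Message()
    msg["Content-Type"] = content_type
    full_type = msg.get_content_type()  # lowercased, parameters stripped
    if full_type in KNOWN_TYPES:
        return full_type
    if msg.get_content_maintype() == "text":
        return "text/plain"  # unknown text/* is shown as plain text
    # Simplification: every other unknown type is treated as opaque bytes.
    return "application/octet-stream"


for header in ("text/toml; charset=utf-8", "application/toml"):
    print(header, "->", effective_type(header))
```

Under these assumptions, a client that has never heard of TOML would display a text/toml attachment as plain text and would offer an application/toml attachment only as a download.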

The point for text/ isn't that it should represent long-textual data, but rather that it is likely to be fully readable by a human as opposed to application/, which is not. The former is the very point of TOML.

As for JSON, "application/json" is correct, because although the user can parse it, the purpose of JSON was always to be a limited data format, not necessarily an easily readable one. Hence the long-disputed decision that it cannot have comments. Nor does it dictate any human-readable whitespace / newlines.

Sorry to re-open a long dead, closed issue, and I don't mean to bikeshed, but whereas one might see this as trivial, I see a correct MIME type as very important.

hairyhenderson commented 4 years ago

"The point of the text/ media type vs application/ is toward displaying the file to the user"

I think that over-simplifies things, but even with this simplification, application/toml still makes more sense than text/toml.

Again, from the MIME RFC:

Other subtypes are to be used for enriched text in forms where application software may enhance the appearance of the text, but such software must not be required in order to get the general idea of the content. Possible subtypes of "text" thus include any word processor format that can be read without resorting to software that understands the format.

The common usage of TOML is not primarily as an "enriched text" format. I wouldn't write a blog post in TOML, for example.

From this repo's README:

TOML aims to be a minimal configuration file format that's easy to read due to obvious semantics.

TOML is specifically defined as a config format, and even though it may be easy to read, it's still impossible for a human to gain the intended value of a TOML file on its own. To actually gain the value of a TOML file, I'd need to pair it with some software to process it.

Unfortunately RFC 2046 is somewhat unclear, likely due to the rarity back in 1996 of human-readable non-binary configuration formats. Besides, the use of MIME types has long since escaped the realm of "Internet Mail", in ways that I'm sure the authors did not at all expect!

Ultimately, until someone goes through the process of registering with IANA, there won't be any official type, and any argument around which one to use will simply be bikeshedding 😉

"I see a correct mime type as very important."

If so, please help out with #574!

patcon commented 4 years ago

Or both :) Ceci n'est pas une bikeshed

From inspecting the mimetype list:

https://www.iana.org/assignments/media-types/text/xml
https://www.iana.org/assignments/media-types/application/xml

(*/rtf is too, and likely some others)

hairyhenderson commented 4 years ago

@patcon That's an interesting point, though XML is a bit "special" in that it can be used to represent both formatted text and configuration/data intended to be processed by applications.

Comparing TOML to XML is a bit of an apples-to-oranges comparison. XML is a generalized Markup Language, and a file in XML format is completely devoid of semantics outside of the context of a schema. Mapping XML to a data structure requires a schema to do so unambiguously, whereas TOML can be mapped unambiguously without a schema.

I think it's probably more appropriate to compare TOML with JSON or YAML, of which only one has a registered media type: application/json. And YAML is explicitly a superset of JSON, which could be construed as a reason to treat application/yaml as the correct type for that format. Though both text/yaml and application/yaml appear in the wild (as well as many other variations).

eksortso commented 4 years ago

I'm late to this shindig, but here's my take. I've read convincing arguments for both application/toml and text/toml here, but I'm siding with application/toml because TOML is essentially a data format, and there's a lot of precedent to use application/* for data, even when humans can read it easily.

I scanned https://www.iana.org/assignments/media-types/media-types.xhtml and looked for prominent keywords.

The registered types certainly don't force usage, and they don't necessarily reflect what's used in the wild, but they do show clearly what each content type is intended for.

Maybe the dual approach suggested by @patcon would be best. But based on current usage and on previous types, application/toml ought to come first. Via #574 of course.

hairyhenderson commented 4 years ago

That's a good summary @eksortso, thanks for digging into it.

  • yaml doesn't show up at all. That is surprising.

Yeah - I don't know why... FWIW I just filed https://github.com/yaml/yaml-spec/issues/49 to suggest it 😉

DeadWisdom commented 4 years ago

I'm gonna shut up cause I'm clearly bikeshedding, then. Sorry about that.

But I will leave this conversation on XML's MIME type: https://mailarchive.ietf.org/arch/msg/xml-mime/jGvJ-bYob0oqV8W9SYjYF5vCy5o/

I am not sure how to help on #574 -- But I will try!

IS4Code commented 1 year ago

One argument against using text/ for the current version of TOML here is that text files commonly undergo line ending and encoding conversions when transmitted, and the default encoding is ASCII. Hence, TOML being restricted to UTF-8, the actual MIME type would be text/toml;charset=utf-8. Additionally, when transmitted to Mac using CR-style line endings, the file would not be parseable, since it does not treat a sole CR as a newline (could be fixed by defining it in terms of lines). Multiline strings would also change their value depending on the target line ending sequence.
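The line-ending concern can be illustrated with a small Python sketch. It relies only on the TOML spec's definition of a newline as LF or CRLF; the toml_lines helper and the sample document are made up for illustration, and no real TOML parser is involved.

```python
import re

# TOML defines a newline as LF or CRLF; a lone CR is not a line break.
TOML_NEWLINE = re.compile(r"\r?\n")


def toml_lines(text: str) -> list[str]:
    """Split text on TOML newlines only (LF or CRLF)."""
    return TOML_NEWLINE.split(text)


original = 'title = "example"\nport = 8080\n'

# Simulate the conversions a text/* transport might apply in transit.
as_crlf = original.replace("\n", "\r\n")
as_cr = original.replace("\n", "\r")

print(len(toml_lines(original)))  # LF: key/value pairs on separate lines
print(len(toml_lines(as_crlf)))   # CRLF: same line structure as LF
print(len(toml_lines(as_cr)))     # lone CR: everything collapses to one line
```

With lone CRs the two key/value pairs end up on what TOML considers a single line, which is why such a file would no longer parse.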

It should be noted however that there are configuration formats in text/, contra @eksortso's observation, and relatively recent ones too ‒ text/turtle for example, being a general RDF graph storage format (see its encoding consideration for inspiration), could be used for configuration too, so TOML would still be in good company there (alongside text/n3, text/csv, text/dns, text/tab-separated-values, text/shaclc, text/vcard and so on).

I support using both text/ (for readability) and application/ (for safety of processing). I could even imagine myself using TOML for structured information presented only to humans (as opposed to human-unfriendly JSON), and there are far less human-readable formats in text/ anyway.

arp242 commented 1 year ago

"when transmitted to Mac using CR-style line endings, the file would not be parseable, since it does not treat a sole CR as a newline"

Only the old "Classic MacOS" used CR as line endings, as did some other old systems like the Commodore 64, but none of that is really relevant any more. Current OS X/macOS uses LF (\n), like other Unix and Unix-like systems.