wmo-im / GRIB2

register GRIB MIME type with IANA #175

Open tomkralidis opened 1 year ago

tomkralidis commented 1 year ago

Details

As discussed at WMO WIS2 meetings earlier this week, various implementations have used application/grib (see OGC 16-060r1) or application/x-grib as a media type when providing GRIB2 responses via APIs. For interoperability and consistency, a media type for GRIB (application/grib) should be officially registered with IANA.

Note that the OGC may be able to help with this process.

cc @erget @6a6d74 @chris-little @solson-nws

Requestor

Tom Kralidis (MSC), @tomkralidis

tomkralidis commented 1 year ago

FYI similar issue for BUFR at https://github.com/wmo-im/BUFR4/issues/140

manfredsc commented 1 year ago

Hi Tom, this is needed, thanks for your efforts. Note that this OGC document (html version at https://docs.ogc.org/is/16-060r2/16-060r2.html) talks about a future application/wmo-grib2, and to use application/x-grib2 meanwhile. In the Apache web server, both application/x-grib and application/x-grib2 are defined (see e.g. https://tika.apache.org/1.28.5/formats.html). So I rather see application/x-grib2 used nowadays for dealing with the GRIB2 file format.

I think it would make sense to define 2 distinct mime-types for GRIB1 and GRIB2. Mime-types are used to assign applications to some file types. There are several applications which can only deal with either GRIB1 or GRIB2, but not both. And GRIB1 seems still to be used at some weather agencies, unfortunately, and archived weather and model data will stay in GRIB1 format for a long time.

tomkralidis commented 1 year ago

Thanks @manfredsc. Note that application/x-grib2 is defined by Apache Tika (rather than by the generic Apache web server). Wouldn't keeping application/grib spare us from revisiting IANA for future GRIB versions?

manfredsc commented 1 year ago

You are correct, using a single application/grib would be more future-proof. But in my opinion it somewhat defeats the purpose of mime types. Mileage may vary. But any standardization with well-defined semantics is definitely good.

sebvi commented 1 year ago

Sometimes files can contain a mixture of GRIB1 and GRIB2, and maybe one day GRIB3, so in my opinion we should not add the edition in the MIME type... I would keep the current application/x-grib or, if we want WMO to be mentioned, as suggested in the OGC document mentioned above, then application/wmo-grib.

Note that we already use application/x-grib at ECMWF.

manfredsc commented 1 year ago

What is the state of this issue? Has some decision been made? The thing is, some weeks ago I asked the maintainer of the unix "file" command to support the mime types x-grib and x-grib2. This may not be correct in the light of this issue, and would probably need some correction.

manfredsc commented 1 year ago

@sebvi: As far as I understand mime-types with "x-" are ad-hoc types, IANA registered types do not have a "x-" prefix, I think. Yes, I know the ECMWF abomination of mixing GRIB1 and GRIB2 types in one file, this gives all sorts of troubles for third-party software. At least for me, this is not an argument for anything.

tomkralidis commented 1 year ago

If we want to register with IANA, then this implies removal of the x- prefix. We should not have versions as part of a MIME type, so application/wmo-grib or application/grib (versions can be realized using the notation application/grib;version=3, say). AFAIK anything after the ; would be left to our own devices (so application/grib is what would be registered with IANA).

chris-little commented 1 year ago

@sebvi @amilan17 @manfredsc I agree with @tomkralidis that version should NOT be part of a MIME type. Also, all the versions of GRIB, (including GRIB0, but I'm probably the only person left remembering that!) include just the characters "GRIB" at the start of the file, and the version is a separate field an octet or two further along. So good GRIB software ought to be able to determine the actual type and perhaps read the version number before gracefully failing! And let us be precise: the MIME type, now more correctly known as Media type, is application and we are discussing the proposed sub-type of grib. For example, the provisional registration of Application/NetCDF does not specify the version.
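The "read the marker, then the edition" idea above can be sketched in a few lines of Python. This is a minimal illustration, not a validating decoder: the function name `sniff_grib_edition` is hypothetical, and it assumes the edition number sits in octet 8 of the indicator section, which holds for GRIB editions 1 and 2 but should be checked against the relevant WMO manual for anything else.

```python
from typing import Optional

GRIB_MAGIC = b"GRIB"


def sniff_grib_edition(header: bytes) -> Optional[int]:
    """Return the GRIB edition from the first 8 octets, or None if not GRIB."""
    if len(header) < 8 or header[:4] != GRIB_MAGIC:
        return None
    # Assumption: octet 8 (index 7) carries the edition number,
    # as it does in the indicator section of editions 1 and 2.
    return header[7]


# Hand-built 8-octet headers for illustration only (not complete messages).
grib1_header = b"GRIB" + b"\x00\x00\x40" + b"\x01"      # 3-octet length, edition
grib2_header = b"GRIB" + b"\x00\x00" + b"\x00" + b"\x02"  # reserved, discipline, edition

print(sniff_grib_edition(grib1_header))
print(sniff_grib_edition(grib2_header))
print(sniff_grib_edition(b"BUFR\x00\x00\x00\x04"))
```

A server that only advertises application/grib could leave this check to the client, which is the trade-off discussed in this thread.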

@tomkralidis Personally, I think the subtype should be grib, not wmo-grib: shorter, reflects actual usage, and we would like non-WMO people to use it! wmo would be part of the IANA registry reference to the authoritative WMO definition.

chris-little commented 1 year ago

@tomkralidis Can you think of any use for the application/grib+something syntax?

tomkralidis commented 1 year ago

@chris-little agree for application/grib (and not application/x-grib). I can't immediately think of any uses for application/grib+something. I guess anything +something could be a balance between identifying in a media type vs. decoding the data.

tomkralidis commented 9 months ago

Update: registration process with IANA can be found at https://www.iana.org/form/media-types

amilan17 commented 8 months ago

https://github.com/wmo-im/CCT/wiki/20.to.22.September.2023 notes:

The team discussed the following options with their considerations, noted that IANA MIME types are outside their area of expertise, and would prefer that other experts decide on the best approach.

1. application/octet-stream – too generic but good catch-all if BUFR/GRIB are both available from the same URL
2. application/grib – ok but requires opening the file to determine the edition
3. application/grib1 – not recommended, because it’s no longer endorsed by WMO
4. application/grib2 – ok, but doesn’t support mixed files
5. application/grib;edition=2 – ok, but the values of “;X=Y” are uncontrolled
6. application/x-grib2 – not recommended, what we do if we don’t register with IANA
7. application/wmo-binary or /wmo-tdcf – less generic than #1, can extend with “;format=GRIB2” in use already for use cases like UTF-8

chris-little commented 8 months ago

@amilan17 How about this priority order for discussion?

  1. application/grib – ok but requires opening the file to determine the edition [Same approach as NetCDF]
  2. application/grib;edition=2 – ok, but the values of “;X=Y” are uncontrolled [Addresses concern with 1, especially if optional.]
  3. application/octet-stream – too generic but good catch-all if BUFR/GRIB are both available from the same URL [not good web practice?]
  4. application/x-grib2 – not recommended, what we do if we don’t register with IANA [or application/x-grib]
  5. application/grib1 – not recommended, because it’s no longer endorsed by WMO
  6. application/grib2 – ok doesn’t support mixed files [future problem for GRIB3/4, existing GRIB1 from archives?]
  7. application/wmo-binary or /wmo-tdcf – less generic than 3, can extend with “;format=GRIB2” in use already for use cases like UTF-8 [Let's break out of the WMO silo]

sebvi commented 8 months ago

I am in favor of application/grib;edition=2 as I was during the meeting (and also because I proposed it :D)

Yes, the X=Y is uncontrolled, but it gives extra information. If it can't be interpreted or is ignored, one can safely fall back to application/grib.
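To illustrate the fallback described above, here is a small Python sketch that splits a media type into its base type and an optional edition parameter using the standard library's `email.message.Message`. The helper name `parse_grib_media_type` and the parameter name `edition` are assumptions taken from this discussion, not from any registered specification.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_grib_media_type(value: str) -> Tuple[str, Optional[int]]:
    """Split a media type string into (base type, edition or None)."""
    msg = Message()
    msg["Content-Type"] = value
    base = msg.get_content_type()        # lower-cased type/subtype
    raw = msg.get_param("edition")       # None when the parameter is absent
    try:
        edition = int(raw) if raw is not None else None
    except (TypeError, ValueError):
        # Unparseable parameter: ignore it and fall back to the base type.
        edition = None
    return base, edition


print(parse_grib_media_type("application/grib;edition=2"))
print(parse_grib_media_type("application/grib"))
```

A client that does not understand the parameter simply uses the first element of the tuple and treats the payload as generic GRIB.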

tomkralidis commented 8 months ago

I would leave version out of the actual media type? See https://www.iana.org/assignments/media-types/media-types.xhtml for more information. We (WMO) can use edition=x accordingly.

So:

ethanrd commented 8 months ago

Hi all - A few thoughts having started this process for netCDF. (Quotes and info below from RFC 6838 "Media Type Specifications and Registration Procedures".)

There are several different trees in the IANA registry name space: the standards tree; the vendor tree; the personal or vanity tree; and the unregistered tree.

The standards tree (Section 3.1) is "intended for types of general interest to the Internet community". Media types in this tree must either be associated with IETF specifications or be registered by a recognized standards-related organization. The process to have an organization recognized as a standards-related organization isn't very difficult. They were willing to recognize Unidata for netCDF with a few email exchanges to explain and justify. WMO, I expect, would be even easier.

The standards tree is where application/grib would have to be registered. Other trees have "faceted names", in other words they start with a prefix (for instance, application/x.grib or application/vnd.wmo.grib).

In the standards tree, parameter names and values are controlled (Section 4.3).

[T]he names, values, and meanings of any parameters MUST be fully specified when a media type is registered in the standards tree, and SHOULD be specified as completely as possible when media types are registered in the vendor or personal trees.

The current recommendation for unregistered media types (Section 3.4) is with a "x." prefix instead of an "x-" prefix. So instead of application/x-grib it should be application/x.grib.
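As a rough illustration of the faceted names described above, the following Python sketch guesses which RFC 6838 registration tree a subtype name belongs to from its prefix. The function name `registration_tree` and the returned labels are hypothetical; the prefixes `vnd.`, `prs.` and `x.` are those RFC 6838 assigns to the vendor, personal and unregistered trees. An unfaceted name only shows that a type could live in the standards tree, not that it is actually registered.

```python
def registration_tree(media_type: str) -> str:
    """Guess the RFC 6838 tree of a media type from its subtype prefix."""
    # Drop any parameters, then take everything after the first "/".
    subtype = media_type.split(";", 1)[0].strip().lower().partition("/")[2]
    if subtype.startswith("vnd."):
        return "vendor"
    if subtype.startswith("prs."):
        return "personal"
    if subtype.startswith("x."):
        return "unregistered"
    if subtype.startswith("x-"):
        return "legacy x- prefix"
    return "standards (unfaceted)"


for name in ("application/grib", "application/vnd.wmo.grib",
             "application/x.grib", "application/x-grib2"):
    print(name, "->", registration_tree(name))
```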

amilan17 commented 4 weeks ago

Request was submitted on 30 May for: application/grib;edition=2