wmo-im / tt-nwpmd

Clarify whether data other than “core data” can be cached by Global Cache #10

Closed yhe-wmo closed 9 months ago

yhe-wmo commented 1 year ago

Need to clarify whether the Global Cache can cache NWP data other than those identified as “core data”. The question is whether NWP Centres can bundle core data together with other data (e.g. open data or recommended data). From the GDPFS perspective, there will also be such a requirement, as (1) some mandatory GDPFS products have not yet been classified as “core data”; and (2) some newly proposed core data may initially be labelled as "recommended data" in the Technical Regulations, until all RSMCs are ready to provide them operationally.

27 May 2023

DECISIONS

  1. Recommended data will NOT be cached by GC
  2. for core data, add properties.cache (true|false, default=true) to WNM as decided by data producer
  3. messages are republished anyway
  4. no issue in making "more" core data available (over and above Cg-Ext 2021/Resolution 1) (TBC)
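
Decisions 1 and 2 can be read as a simple rule for a Global Cache. The following is a minimal sketch only, assuming the data policy has already been determined (for example from the topic) and that the notification message is available as a parsed dictionary. The function name `should_cache` is hypothetical and not part of any WIS2 software.

```python
def should_cache(data_policy: str, message: dict) -> bool:
    """Sketch of decisions 1 and 2; not an official implementation."""
    # Decision 1: recommended data is never cached by the Global Cache.
    if data_policy != "core":
        return False
    # Decision 2: for core data the producer may set properties.cache;
    # the value defaults to true when the property is absent.
    return bool(message.get("properties", {}).get("cache", True))


# Core data with no explicit flag: cached by default.
print(should_cache("core", {"properties": {}}))
# Core data where the producer opted out of caching.
print(should_cache("core", {"properties": {"cache": False}}))
# Recommended data is not cached, whatever the flag says.
print(should_cache("recommended", {"properties": {"cache": True}}))
```

Per decision 3, the message itself would still be republished in all three cases; only the caching of the data differs.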

Haddouch101 commented 1 year ago

Recommended data should not be cached at the moment, and we are not sure whether we are going to cache all core data, because of data volume.

amilan17 commented 1 year ago

I think the question also pertains to a situation where there is a combination of core and recommended data in the same data file.

ThomasColemanBoM commented 1 year ago

Yes. If someone published a file and it contained all the data they generate, both core and recommended, what happens? @Haddouch101 if there are limitations about what can be cached, do centres need to ensure the cachable data is sent separately to other data or will the cache extract what it wants from the data provided?

sebvi commented 1 year ago

GRIB2 is a record-based data format, so in principle the Global Cache could easily extract the core fields from the files. This is easy to do if we agree on index files; those could contain the offset and length of each record.
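
To illustrate the index-file idea, here is a rough sketch assuming an index that lists, for each record, its byte offset, its length, and a flag saying whether the record is core. The `IndexEntry` layout and the `core` flag are hypothetical; no index format has been agreed in this thread, and the sample bytes below are placeholders, not real GRIB2 messages.

```python
from typing import NamedTuple, Sequence


class IndexEntry(NamedTuple):
    offset: int  # byte offset of the record within the file
    length: int  # length of the record in bytes
    core: bool   # hypothetical flag: True if the record is core data


def extract_core_records(data: bytes, index: Sequence[IndexEntry]) -> bytes:
    """Concatenate only the records flagged as core, in index order."""
    return b"".join(
        data[entry.offset:entry.offset + entry.length]
        for entry in index
        if entry.core
    )


# Placeholder bytes standing in for three records in one file.
data = b"AAAABBBBBBCC"
index = [
    IndexEntry(offset=0, length=4, core=True),
    IndexEntry(offset=4, length=6, core=False),
    IndexEntry(offset=10, length=2, core=True),
]
core_only = extract_core_records(data, index)
```

Because each record is addressed by offset and length, the cache would not need to decode the file to drop the non-core records.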

Haddouch101 commented 1 year ago

When someone publishes a file containing all the data they generate, both core and recommended, they need to specify in the topic whether it is core or recommended. If it is core data, then according to the Technical Regulations it will be cached in the Global Cache; if not, access will be through the WIS2 node (originating centre). But as we discussed in ET-W2AT, because of possible data volume issues, there was a proposal to add something like "to be cached" to the message; in that case only data with this "to be cached" option will be cached in the GC.
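
As a sketch of how a subscriber might read the data policy from the topic, the snippet below assumes the policy is the segment immediately following a `data` segment. The example topic strings and the helper name `data_policy_from_topic` are illustrative assumptions, not taken from the topic hierarchy specification.

```python
from typing import Optional


def data_policy_from_topic(topic: str) -> Optional[str]:
    """Return 'core' or 'recommended' if the topic carries one, else None."""
    segments = topic.split("/")
    if "data" in segments:
        i = segments.index("data")
        # Assumption: the data policy directly follows the 'data' segment.
        if i + 1 < len(segments) and segments[i + 1] in ("core", "recommended"):
            return segments[i + 1]
    return None


# Hypothetical topic for illustration only.
policy = data_policy_from_topic(
    "origin/a/wis2/example-centre/data/core/weather/prediction"
)
```

A file that mixes core and recommended data still maps to a single topic, so this lookup yields one policy for the whole file.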

amilan17 commented 1 year ago

@Haddouch101 Thank you for your input. I'll try to provide some clarity (as I just discussed with Hassan in my office.)

The Unified Data Policy has an annex that lists the "minimum set of core data that Members shall exchange on a free and unrestricted basis" and then there are high-level descriptions of types of data that should be core and sometimes there are links to supporting manuals for further details.

Essentially, there is no finite list of "core" data to consider for approval or not. If the data can be exchanged freely and without restriction, then it can be identified as "core". If the notification message has "core", then WIS2 knows the data can be exchanged on the cache. If it has "recommended", then WIS2 knows the data cannot be exchanged on the cache.

amilan17 commented 1 year ago

In other words, it's not possible to publish recommended data to the cache, because WIS2 enables data to be exchanged on a free and unrestricted basis. Recommended data applies to data with access constraints of some sort or another.

amilan17 commented 1 year ago

list of core data https://library.wmo.int/doc_num.php?explnum_id=11001#page=139

yhe-wmo commented 1 year ago

In terms of the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))), clarification was sought within the Secretariat (from Lars Peter) that “core data” of global analysis and prediction fields [see 1.2 (a) in Annex 1 of Res.1 (Cg-Ext(2021))] are only those specified in the Manual on the GDPFS. (see TT-NWPMD meeting on 17.04.2023)

TT-NWPMD meeting on 11.05.2023:

@Haddouch101

amilan17 commented 1 year ago

see notes in this issue: wmo-im/wis2-guide#7

amilan17 commented 1 year ago

4. no issue in making "more" core data available (over and above Cg-Ext 2021/Resolution 1) (TBC)

@echarpent @kpremec Do you have any advice or insight as to whether data not specifically designated as "core" in the data policy can still be labeled as "core" data?

6a6d74 commented 1 year ago

See also How to filter by spatial/temporal granularity to avoid unneccessary load on subscriber?

yhonda21 commented 1 year ago

See also How to filter by spatial/temporal granularity to avoid unneccessary load on subscriber?

This seems to be the correct link. How to filter by spatial/temporal granularity to avoid unneccessary load on subscriber?

yhonda21 commented 1 year ago

When core data and recommended data are combined in one file and published, only one notification, with either 'core' or 'recommended' as the data policy, will be issued, since TH level 7 is about the data policy. Do I understand it correctly?

yhonda21 commented 12 months ago

Only core data should be stored in the Global Cache. If core and recommended data are archived in one file, it is not clear whether it can be stored in the Global Cache. The WIS architecture doesn't recognize whether the data are core or recommended. (see TT-NWPMD meeting on 05.07.2023)

yhonda21 commented 9 months ago

NWPMD meeting on 2023.09.14