wmo-im / et-acdm

WMO Expert Team on Atmospheric Composition Data Management (ET-ACDM)
https://wmo-im.github.io/et-acdm
MIT License

WIS2 topic hierarchy, structure #20

Closed markusfiebig closed 3 months ago

markusfiebig commented 1 year ago

This issue refers to this Wiki page: https://github.com/wmo-im/et-acdm/wiki/ET-ACDM-input-for-development-of-WIS2-topic-hierarchy

  1. In a hierarchical structure, the position of an item must be unambiguous. For gases, there are cases that could be placed under either greenhouse gases or reactive gases. Ozone is given a special role, even though it is a reactive gas. The question is whether the hierarchy should be based on WMO organisational history, or whether we try to clean up and use a structure guided by internal logic.
  2. Relevant bodies, e.g. CF, are now talking about "aerosol particles" instead of "aerosol", finding a compromise between different communities.

tomkralidis commented 1 year ago

Target is a first pass for our next meeting in April 2023.

tomkralidis commented 1 year ago

ET-ACDM 2023-06-08:

cc @atverm @markusfiebig @gaochen-larc @ejwelton @sergimorenovalero @joergklausen (please tag others as needed).

joergklausen commented 1 year ago

ET-ACDM-6 discussed and elaborated further ....

NB: Definitions suggested in brackets added after the meeting.

"Our" level 8 to work out is "Atmospheric composition", with following sub-categories

tomkralidis commented 1 year ago

Should we consider putting forth a draft? Suggested next steps would be to create a topic-hierarchy structure in this repository as part of the proposal for our initial entry into the WIS2 topic hierarchy. Thoughts?

joergklausen commented 1 year ago

> Should we consider putting forth a draft? Suggested next steps would be to create a topic-hierarchy structure in this repository as part of the proposal for our initial entry into the WIS2 topic hierarchy. Thoughts?

I agree, but I don't quite understand what you mean we should do in addition ...

tomkralidis commented 1 year ago

@joergklausen I've put forth a first pass in this branch for review based on the above.

ejwelton commented 1 year ago

I agree with the 3 topics: observations, analysis-prediction, advisories-warnings.

Under observations, I feel strongly that the aerosols and clouds sub-topic should be split into separate sub-topics.

For observations, I feel that our ET can eventually flesh this out on its own. However, we need input from experts outside our ET for the analysis-prediction and advisories-warnings topics. For aerosol-related sub-topics, I would suggest first getting input from Sara Basart (WMO). I can also meet with her in Geneva in 2 weeks if she is available (I will be there for another meeting).

gaochen-larc commented 10 months ago

What about water vapor? We can treat it as a meteorological variable, but it is also important to atmospheric chemistry.

markusfiebig commented 10 months ago

> What about water vapor? We can treat it as a meteorological variable, but it is also important to atmospheric chemistry.

For ACTRIS, we have now agreed to define:

  - water vapour mass concentration, alt label: absolute humidity
  - water vapour mass fraction, alt label: specific humidity
  - water vapour liquid water saturation fraction, alt label: relative humidity with respect to water
  - water vapour ice water saturation fraction, alt label: relative humidity with respect to ice

And we carry those both as meteorological and chemical variables. This is a very good example why a strict, unambiguous hierarchy will be difficult or impossible to achieve.
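As a minimal sketch of how the preferred and alternative labels listed above could be resolved in practice, assuming a plain lookup table (the names `ALT_TO_PREFERRED` and `preferred_label` are hypothetical and not part of any ACTRIS tooling):

```python
# Alternative labels mapped to the preferred labels quoted in this comment.
ALT_TO_PREFERRED = {
    "absolute humidity": "water vapour mass concentration",
    "specific humidity": "water vapour mass fraction",
    "relative humidity with respect to water": "water vapour liquid water saturation fraction",
    "relative humidity with respect to ice": "water vapour ice water saturation fraction",
}


def preferred_label(label: str) -> str:
    """Return the preferred label for a known alternative label.

    Labels that are not known alternatives are returned normalised
    (stripped and lower-cased) but otherwise unchanged.
    """
    key = label.strip().lower()
    return ALT_TO_PREFERRED.get(key, key)
```

A lookup like this only normalises names. It does not decide whether a variable is filed as meteorological or chemical, which is the ambiguity raised here.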

joergklausen commented 8 months ago

The topic has been migrated to https://github.com/wmo-im/wis2-topic-hierarchy/tree/et-acdm-topics/topic-hierarchy/earth-system-discipline/atmospheric-composition, i.e., the results of the discussion should be maintained there. Comments on the topic hierarchy should still be entered here ...

sbasart-wmo commented 8 months ago

Hello Jörg,

This week I had a meeting with some WMO colleagues to try to understand how the WIS2.0 system is organised and to provide some feedback on the proposal.

Important technical considerations about WIS2.0 to be aware of for the hierarchy design are the following:

Considering these points, I would suggest the following structure under atmospheric-composition

surface-based-observations: We need to take into consideration the operational workflow that follows the structure of the GAW World Data Centres. I mean how the files are submitted to the Data Centres, because this is what I understand will be associated with the notification. Also, before defining the categories here, we would need to identify which datasets are potentially available in NRT. @ejwelton, as far as I understand, the definition of the parameters is not directly connected with the WIS2.0 protocol. The WIS2.0 metadata file lists all the parameters included in the file. The WMO standards/conventions are a separate discussion.

space-based-products: At the moment, there is no example for weather-space, but I would say that here we should consider the different satellite-retrieved products, which are associated with different datasets. For example, aod > modis-retrieval, or no2 > omi-retrieval. But this is something that we need to double-check with the WMO satellite group.

analysis-predictions

advisories-warnings

markusfiebig commented 8 months ago

We should consider Sara's latest comment in this issue. Here, she says it is an absolute requirement to only have one point for a data stream in the discovery hierarchy. The fact that we only deal with RT data isn't really relevant. After all, we are defining a hierarchy for discovery purposes.

Having only one entry point per data stream means we need a 1:1 logic between data stream and hierarchy structure. To organize the hierarchy by topics cogently implies that there is a 1:n relation between data stream and hierarchy structure. Most of our data streams can be sorted under several topics. That leads to the logical conclusion that we cannot organize the hierarchy by topic, but need to use other concepts complying with the intrinsic logic of the data streams - for example variable matrix.
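The 1:1 versus 1:n argument above can be illustrated with a small sketch. The stream names and candidate topics below are hypothetical examples, not actual GAW data streams or agreed WIS2 topics:

```python
# Hypothetical data streams and the topics each could plausibly be filed under.
candidate_topics = {
    "ozone-surface-insitu": ["reactive-gases", "greenhouse-gases"],
    "water-vapour-profile": ["meteorology", "atmospheric-chemistry"],
    "aerosol-lidar-profile": ["aerosols"],
}


def ambiguous_streams(candidates: dict[str, list[str]]) -> list[str]:
    """Return the streams with more than one candidate topic (a 1:n relation)."""
    return sorted(name for name, topics in candidates.items() if len(topics) > 1)


# Streams listed here would violate the "one entry point per data stream" rule.
print(ambiguous_streams(candidate_topics))
```

Any stream reported by such a check would need a different sorting criterion to get a single, unambiguous place in the hierarchy.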

sbasart-wmo commented 8 months ago

> We should consider Sara's latest comment in this issue. Here, she says it is an absolute requirement to only have one point for a data stream in the discovery hierarchy. The fact that we only deal with RT data isn't really relevant. After all, we are defining a hierarchy for discovery purposes.
>
> Having only one entry point per data stream means we need a 1:1 logic between data stream and hierarchy structure. To organize the hierarchy by topics cogently implies that there is a 1:n relation between data stream and hierarchy structure. Most of our data streams can be sorted under several topics. That leads to the logical conclusion that we cannot organize the hierarchy by topic, but need to use other concepts complying with the intrinsic logic of the data streams - for example variable matrix.

@markusfiebig I've tried to include some examples, but I still need to figure out how to proceed with the surface-based observations. Who else is potentially using these observations? This is the main question we need to clarify to create the hierarchy, i.e., how to group the file/system notifications with an application in mind. Considering potential "daily" users, we can consider:

Right now, the only example of a GAW NRT dataset I have in mind is MPLNet (i.e., Judd), which provides aerosol profiles. You could put it in a category called "aviation", but it can also be used by "modellers"; this is my main doubt. A clean and simple solution is to split it into three categories: surface, column-integrated and profiles. What do you think?

markusfiebig commented 8 months ago

@sbasart-wmo, your proposal would use the observation geometry as the sorting criterion, which would be an option since it allows for an unambiguous distinction between observations.

gaochen-larc commented 8 months ago

What about aircraft-based in-situ measurements? Should these measurements be classified in the surface category? There are also aircraft spiral profile measurements... Also, what about balloon sonde or dropsonde profile measurements? Many profile measurements would give both profile and column-integrated quantities.

sbasart-wmo commented 7 months ago

@gaochen-larc If you check other examples, you will see that aircraft_observations can be considered a different category from surface_based_observations. We can follow the same criteria for balloon_based_observations.

amilan17 commented 7 months ago

> After all, we are defining a hierarchy for discovery purposes.

@markusfiebig - the WIS metadata record (WCMP2) is for discovery. The topic hierarchy is essentially a channel for receiving notifications after the dataset you want has been discovered. The channel has minimal meaning. It's kind of like a radio station channel, e.g. AM or FM with specific call numbers.
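To make the "channel" idea concrete, here is a minimal sketch of how a subscriber could select notifications by topic, assuming MQTT-style topic filters (`+` matches one level, `#` matches all remaining levels). The function `topic_matches` is a simplified, hypothetical helper, not part of any WIS2 library, and `example-centre` is a made-up centre-id:

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Check a topic against an MQTT-style filter.

    '+' matches exactly one level; '#' matches all remaining levels.
    Simplified: it does not validate that '#' is the last filter level.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard matches everything from here on
        if i >= len(topic_levels):
            return False  # filter has more levels than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level does not match
    return len(filter_levels) == len(topic_levels)


# Hypothetical topic following the structure discussed in this issue.
topic = "origin/a/wis2/example-centre/data/core/atmospheric-composition/surface-based-observations"
print(topic_matches("origin/a/wis2/+/data/core/atmospheric-composition/#", topic))
```

In this picture the topic only selects which notifications arrive. Finding out which dataset to subscribe to is left to the discovery metadata.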

markusfiebig commented 7 months ago

@amilan17 - in the practical use cases, the topic hierarchy will essentially be an abbreviated version of discovery metadata. If it was "only a name", we might call the channels "Donald Duck" etc, which we obviously aren't doing. If the user needs to know the exact location of a product in the topic hierarchy in order to find it, we are creating a system that only a few core experts can use. We should avoid that.

tomkralidis commented 7 months ago

We need to strike a balance in having a clear way to delineate the topics that datasets can be published to for event-driven workflows. At the same time, the WIS2 Topic Hierarchy is not a taxonomy or knowledge organization system per se, and the key workflow is:

amilan17 commented 7 months ago

Sara's proposal summarized as topics:

  1. origin/a/wis2/{centre-id}/data/{core or recommended}/atmospheric-composition/surface-based-observations
  2. origin/a/wis2/{centre-id}/data/{core or recommended}/atmospheric-composition/space-based-products
  3. origin/a/wis2/{centre-id}/data/{core or recommended}/atmospheric-composition/predictions
  4. origin/a/wis2/{centre-id}/data/{core or recommended}/atmospheric-composition/advisories-warnings/sand-dust
  5. origin/a/wis2/{centre-id}/data/{core or recommended}/atmospheric-composition/advisories-warnings/air-pollution
  6. origin/a/wis2/{centre-id}/data/{core or recommended}/atmospheric-composition/advisories-warnings/wildfires
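The topic pattern in this list can be assembled programmatically. The following is a minimal sketch, assuming only the structure shown in the list; `build_topic` and the centre-id `example-centre` are hypothetical and not part of any WIS2 tooling:

```python
# The two data policy values that appear in the topic pattern above.
DATA_POLICIES = ("core", "recommended")


def build_topic(centre_id: str, data_policy: str, *subtopic_levels: str) -> str:
    """Assemble a topic string following the pattern listed above."""
    if data_policy not in DATA_POLICIES:
        raise ValueError(f"data_policy must be one of {DATA_POLICIES}, got {data_policy!r}")
    if not subtopic_levels:
        raise ValueError("at least one sub-topic level is required")
    levels = [
        "origin", "a", "wis2", centre_id, "data", data_policy,
        "atmospheric-composition", *subtopic_levels,
    ]
    return "/".join(levels)


# Example with a made-up centre-id.
print(build_topic("example-centre", "core", "advisories-warnings", "sand-dust"))
```

The sub-topic levels are passed in rather than hard-coded, since the names under atmospheric-composition were still under discussion in this thread.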

Note that we ONLY need topics for near-real-time/real-time data notifications.

sbasart-wmo commented 7 months ago

@amilan17 it is analysis-predictions

amilan17 commented 7 months ago

> For example, aod > modis-retrieval, or no2 > omi-retrieval. But this is something that we need to double-check with the WMO satellite group.

@sbasart-wmo - CGMS is proposing all operational satellites for weather and space-weather. I see no issue with repeating the same structure for relevant satellites under atmospheric-composition.

amilan17 commented 7 months ago

> what about aircraft-based in-situ measurements?

@gaochen-larc we should consider these datatypes as surface-based-observations.

tomkralidis commented 3 months ago

As discussed at ET-ACDM 2024-08-13, the current proposal can be found in https://github.com/wmo-im/wis2-topic-hierarchy/tree/et-acdm-topics/topic-hierarchy/earth-system-discipline/atmospheric-composition

As a next step, we have agreed that each data centre will provide the topic(s) it would publish to using the proposed hierarchy.

joergklausen commented 3 months ago

Issue closed with reference to #24 where a sort of 'implementation test' is open for contributions.