climsoft / Climsoft

Climsoft Desktop for Windows - http://www.climsoft.org
GNU General Public License v3.0

Element information #356

Open rdstern opened 6 years ago

rdstern commented 6 years ago

This is the Metadata > Element table and corresponding dialogue.

I have 4 questions regarding the elements.

  1. They are defined when CLIMSOFT is established, and each element has a Type. This could be Daily, Hourly, etc. From the initial script file there seem to be quite a number of types, including Daily, Hourly, Monthly, Dekadal, Synop, AWS. But the type in the form just has 4 of these: Daily, Hourly, Monthly, AWS. I wonder why?
  2. Section 5.2.1 seems to indicate you can either add a new element, or redefine an existing one. I can understand adding new ones, not least because those of type AWS seem few. But I could redefine element 2 to be hourly, etc. Is this what is wanted? What are the limits of the recognised CLIMSOFT "code of conduct"?
  3. What I would like to be able to do (and we have needed this for the Western Kenya data) is to check the codes used for the elements and change them if necessary. After data are in the final table, suppose I find that element sunshine hours was coded as element 132 (which it was), but it was daily and hence should have been element 84. How would I handle that correction in the database?
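
As a rough illustration of question 3, the recoding could be thought of as a single guarded UPDATE on the final table. This is only a sketch: it uses an in-memory SQLite table as a simplified stand-in for observationfinal, the station id, the obsValue column and the helper name recode_element are assumptions made for the example, and a real Climsoft database (MySQL/MariaDB) would need a backup and a check against its actual key columns first.

```python
import sqlite3

# Simplified, hypothetical stand-in for Climsoft's observationfinal table.
# Only recordedFrom, describedBy and obsDatetime come from the discussion;
# obsValue and the sample rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE observationfinal (
           recordedFrom TEXT, describedBy INTEGER, obsDatetime TEXT, obsValue REAL,
           PRIMARY KEY (recordedFrom, describedBy, obsDatetime))"""
)
conn.executemany(
    "INSERT INTO observationfinal VALUES (?, ?, ?, ?)",
    [
        ("STN001", 132, "2001-01-01 06:00:00", 7.5),
        ("STN001", 132, "2001-01-02 06:00:00", 8.1),
        ("STN001", 5, "2001-01-01 06:00:00", 0.0),
    ],
)


def recode_element(conn, station_id, old_id, new_id):
    """Move one station's observations from old_id to new_id; return rows changed."""
    # Refuse to proceed if the target element already holds values for the
    # same station and time, because the update would then collide with them.
    clash = conn.execute(
        """SELECT COUNT(*) FROM observationfinal AS o
           JOIN observationfinal AS n
             ON n.recordedFrom = o.recordedFrom AND n.obsDatetime = o.obsDatetime
           WHERE o.recordedFrom = ? AND o.describedBy = ? AND n.describedBy = ?""",
        (station_id, old_id, new_id),
    ).fetchone()[0]
    if clash:
        raise ValueError(f"{clash} observations already exist under element {new_id}")
    with conn:  # commits on success, rolls back if the update fails
        cur = conn.execute(
            "UPDATE observationfinal SET describedBy = ? "
            "WHERE recordedFrom = ? AND describedBy = ?",
            (new_id, station_id, old_id),
        )
    return cur.rowcount


changed = recode_element(conn, "STN001", 132, 84)
print(changed, "observations moved from element 132 to element 84")
```

Restricting the update to one station at a time keeps the change easy to review and to reverse if the recoding turns out to be wrong.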

Steve-Palmer commented 6 years ago

This Element naming has always been problematic. Firstly, using text "Daily" "Hourly" "Monthly" is not very helpful. Secondly, using the same field to add the qualifier "SYNOP", "AWS" is also not helpful. We got here because (in the dim past) CLICOM only gave a single field to hold all this information. The Climsoft metadata model has always had sufficient richness to hold the information properly, but use of these was not enforced in earlier implementations (and up to V3.2, there were no dialogues for these tables). I have not yet checked if the current version goes into sufficient detail for the metadata.

In https://library.wmo.int/pmb_ged/wmo_1131_en.pdf this is covered on pages 64 - 66. The basic concept is that a Station has a set of Sensors, and each Sensor will provide observations. The key text is:

4.3.1.7 Details of what meteorological variable is being observed by the sensor (i.e. the observed property), including:

  – Phenomena observed
  – Frequency of measurement
  – Frequency of acquisition
  – Units of measurement
  – Precision of measurement.

Each observation comes from a Sensor. Each Sensor measures an Element. I would argue strongly that the Element should only define the top-level phenomenon which the Sensor is observing - Rainfall, Air Temperature, Pressure etc, and that all the rest should be defined in the metadata. So if the user wants to only select observations which come from AWS or via SYNOP or METAR messages, then they do that by selecting the appropriate fields in the station metadata, not from the text of the Element code.

A Station consists of a co-located set of Sensors. There is no restriction on how many sensors there may be at a Station measuring any particular Element. So there may be several manual raingauges as well as raingauges which are part of an AWS. All are valid, and all ought to be available in the climate database - comparing the different sensors is a significant part of managing long-term consistency. The CIMO guide suggests periods of parallel running when the instrument systems are changed.

The transmission system will also influence the observation element values stored, so this needs to be recorded. There should be business rules about what is stored. For example, a report of Temperature received via a METAR report will be in whole deg C. The temperature from the same sensor at the same time received via direct telemetry from the AWS will be in 1/10 deg C - hence it is reasonable to decide that the direct telemetry value will be stored and the METAR one will not, unless the METAR is the only available report.
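
A business rule of this kind could be sketched as a simple priority ranking over transmission routes. The source names, their ordering and the function choose_report are assumptions for illustration only, not an existing Climsoft setting.

```python
# Assumed priority order: the lowest number wins. The names and the ranking
# are illustrative; a real deployment would define its own rules.
SOURCE_PRIORITY = {"AWS_TELEMETRY": 0, "SYNOP": 1, "METAR": 2}


def choose_report(reports):
    """Pick the report to store from reports of the same sensor and time."""
    if not reports:
        return None
    return min(reports, key=lambda report: SOURCE_PRIORITY[report["source"]])


reports = [
    {"source": "METAR", "temperature_c": 24.0},          # whole degrees C
    {"source": "AWS_TELEMETRY", "temperature_c": 24.3},  # tenths of a degree C
]
stored = choose_report(reports)

# When the METAR is the only available report, it is the one that is kept.
metar_only = choose_report([{"source": "METAR", "temperature_c": 24.0}])
```

Keeping the ranking in one table means the rule can be changed without touching the selection logic.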

The issue for accumulated values (Rainfall, max and min temperature, sunshine) is more subtle. Here one needs to know the period of the accumulation. One can obtain a daily rainfall by (a) a manual raingauge read once a day, or (b) the total of two 12-hourly observations (as is the norm in Africa in SYNOP reports), or (c) the total of 24 hourly readings. All of these are valid - which one is given priority in the user data is a choice. In East Africa, daily climate data is usually defined as ending at 0600 UTC each day. Practically, I would rather know the exact time of the observation and the period it covers rather than just the nominal time - though for most historic data we do not have these exact times. This enables one to make use of accumulations over several days. For example, if a user wants monthly rainfall, and there is a weekend accumulation within the month, the total is still a pure measurement. It is only an estimate if the weekend accumulation spans the month end.
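
The month-end reasoning can be sketched as follows, assuming each observation carries its end time and the period it covers. The function monthly_rainfall, the tuple layout and the choice of a month window running from 0600 UTC on the first of the month to 0600 UTC on the first of the next month are assumptions for the example.

```python
from datetime import datetime, timedelta


def monthly_rainfall(observations, year, month, day_end_hour=6):
    """Sum rainfall for a month; return (total_mm, is_pure_measurement).

    Each observation is (end_time, period_hours, amount_mm): the amount fell
    during the period_hours ending at end_time (times in UTC).
    """
    month_start = datetime(year, month, 1, day_end_hour)
    next_month = datetime(year + month // 12, month % 12 + 1, 1, day_end_hour)
    total, pure = 0.0, True
    for end_time, period_hours, amount_mm in observations:
        start_time = end_time - timedelta(hours=period_hours)
        if start_time >= next_month or end_time <= month_start:
            continue  # entirely outside the month
        total += amount_mm
        if start_time < month_start or end_time > next_month:
            pure = False  # accumulation spans a month boundary: an estimate
    return total, pure


# A daily reading plus a 72-hour weekend accumulation, both inside March 2020.
within = [
    (datetime(2020, 3, 2, 6), 24, 5.0),
    (datetime(2020, 3, 9, 6), 72, 12.0),
]
# A 72-hour accumulation that started in February and ended in March.
spanning = [(datetime(2020, 3, 2, 6), 72, 4.0)]

print(monthly_rainfall(within, 2020, 3))
print(monthly_rainfall(spanning, 2020, 3))
```

This only works when the period is stored with each value, which is the point being made about recording exact times and periods rather than nominal ones.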

We should reduce the number of element codes to the bare minimum by removal of all reference to period or to transmission type. Can we start by deleting all the Element codes which include the text "AWS"?

I don't think we should make these major changes in Climsoft V4.1 or V4.2, but it would be good to know that the Metadata Manager role has dialogues to enable population of all the station element and observation element tables. In an ideal world, these should link to both Stores and Calibration records, so that the life history of each sensor can be tracked.

isedwards commented 5 years ago

> I would argue strongly that the Element should only define the top-level phenomenon which the Sensor is observing - Rainfall, Air Temperature, Pressure etc, and that all the rest should be defined in the metadata.

We have discussed this in depth at the Kitale workshop and we can see how to move forward:

  1. Work has been undertaken to identify the group each element code belongs to (the groups correspond to the "top-level phenomenon" mentioned by @Steve-Palmer above).
  2. In the next version, users will select the top-level element, and then the appropriate related meta-data, e.g. the duration/period that the value corresponds to ('hourly', 'daily', etc.)
  3. We will maintain a table of legacy element codes. This will allow us to convert the historic codes, e.g. element code PRMX15 will become Rainfall with periodValue 15 and periodUnit minutes.
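
A legacy-code table of the kind described in item 3 could look roughly like this. Only the PRMX15 entry comes from the discussion; the ElementSpec class, the LEGACY_CODES name and the convert_legacy helper are hypothetical, and a real table would be filled in from the element grouping work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementSpec:
    """Top-level element plus the period metadata split out of a legacy code."""
    element: str
    period_value: Optional[int] = None
    period_unit: Optional[str] = None


# Hypothetical lookup table; PRMX15 is the only mapping given in the thread.
LEGACY_CODES = {
    "PRMX15": ElementSpec("Rainfall", 15, "minutes"),
}


def convert_legacy(code):
    """Translate a legacy element code, failing loudly for unmapped codes."""
    try:
        return LEGACY_CODES[code]
    except KeyError:
        raise KeyError(f"no legacy mapping for element code {code!r}") from None


spec = convert_legacy("PRMX15")
print(spec.element, spec.period_value, spec.period_unit)
```

Raising an error for unmapped codes, rather than guessing, keeps unconverted historic data visible until someone assigns it a group.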

isedwards commented 5 years ago

If we no longer distinguish between automatic weather stations and other instruments by using a variation in the element code then it will no longer be possible to determine which instrument was used to capture an observation when multiple instruments are in use at the same station recording the same element.

Currently, for each observation, it is possible to determine the instrument that would have been used by querying the stationelement table and seeing which instrumentId was in use between beginDate and endDate (at that stationId, for that elementId).

However, if multiple instruments are in use at the same time, it will no longer be possible to tell which one produced a given observation.

stationelement
beginDate
endDate
+ recordedFrom (stationId)
+ describedBy (elementId)
+ recordedWith (instrumentId)
height
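
The lookup described above, and the ambiguity it runs into, can be sketched with an in-memory SQLite copy of the stationelement columns listed here. The station id, element id, instrument ids and dates are invented sample data, and instruments_in_use is a hypothetical helper.

```python
import sqlite3

# Simplified stand-in for stationelement using the column names listed above;
# all rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE stationelement (
           recordedFrom TEXT, describedBy INTEGER, recordedWith TEXT,
           beginDate TEXT, endDate TEXT)"""
)
conn.executemany(
    "INSERT INTO stationelement VALUES (?, ?, ?, ?, ?)",
    [
        ("STN001", 5, "MANUAL_GAUGE_1", "1990-01-01", "2020-12-31"),
        ("STN001", 5, "AWS_GAUGE_1", "2015-06-01", "2020-12-31"),
    ],
)


def instruments_in_use(conn, station_id, element_id, obs_date):
    """Return every instrumentId active for the station/element on obs_date."""
    rows = conn.execute(
        """SELECT recordedWith FROM stationelement
           WHERE recordedFrom = ? AND describedBy = ?
             AND beginDate <= ? AND endDate >= ?
           ORDER BY recordedWith""",
        (station_id, element_id, obs_date, obs_date),
    ).fetchall()
    return [row[0] for row in rows]


before_aws = instruments_in_use(conn, "STN001", 5, "2010-03-01")
after_aws = instruments_in_use(conn, "STN001", 5, "2018-03-01")
print(before_aws)  # one instrument: the observation can be attributed
print(after_aws)   # two instruments: the observation is ambiguous
```

Once the AWS gauge is installed alongside the manual one, the date-range lookup returns both instruments, which is exactly the case where the element code can no longer be dropped without losing information.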

A possible solution is to add recordedWith (the instrumentId) to the observationinitial and observationfinal tables. But this would make the stationelement table mostly redundant.

Is this the only solution to the problem or are there other possible solutions?

observationfinal
obsDatetime
+ recordedFrom (stationId)
+ describedBy (elementId)
period
obsLevel

(recordedWith currently doesn't exist in the observation tables because it was possible to determine the instrument from the stationelement table)

NOTES:

isedwards commented 5 years ago

Currently element codes are NOT unique, e.g.:

SELECT * FROM obselement WHERE abbreviation='MXPRCP';

elementId  abbreviation  elementName              ...
210        MXPRCP        Precip Max in the Month
410        MXPRCP        Precip Max Dly 10 Days
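
To find every abbreviation that is shared in this way, a GROUP BY query could be used. The sketch below runs it against an in-memory SQLite table holding the two rows shown above plus one invented row (elementId 5) for contrast; the real obselement table has more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE obselement ("
    "elementId INTEGER PRIMARY KEY, abbreviation TEXT, elementName TEXT)"
)
conn.executemany(
    "INSERT INTO obselement VALUES (?, ?, ?)",
    [
        (210, "MXPRCP", "Precip Max in the Month"),
        (410, "MXPRCP", "Precip Max Dly 10 Days"),
        (5, "PRECIP", "Precip daily"),  # invented row, not from the thread
    ],
)

# List each abbreviation used by more than one elementId.
duplicates = conn.execute(
    """SELECT abbreviation, COUNT(*) AS uses
       FROM obselement
       GROUP BY abbreviation
       HAVING COUNT(*) > 1
       ORDER BY abbreviation"""
).fetchall()
print(duplicates)
```

The same SELECT should run unchanged on MySQL/MariaDB, though that has not been checked against a live Climsoft database here.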