wmo-im / BUFR4

BUFR edition 4
MIT License
27 stars 9 forks source link

New BUFR sequences for encoding IKFS2 products #10

Closed erget closed 4 years ago

erget commented 4 years ago

Branch

https://github.com/wmo-im/BUFR4/tree/issue-10

Summary and purpose

This document proposes new BUFR sequences and quality flags in order to encode radiances observed using the IKFS-2 instrument on board the Meteor-M N2 satellites.

Action proposed

The meeting is requested to approve the contents for inclusion within the next update to the WMO Manual on Codes.

Discussions

The Infrared Fourier Spectrometer - 2 (IKFS-2) is a passive cross-nadir infrared sounder for temperature and humidity sounding. It can also be used to observe the ozone profile and total column greenhouse gases. The IKFS-2 flies on RosHydroMet’s Meteor-M N2 satellites. This series consists of seven satellites, with an expected temporal coverage from 2014 to 2028.

It is proposed to add three quality elements and a sequence to the Manual on Codes. These are intended for use in encoding spectral observations from the IKFS-2 instrument, along with a selection of quality data.

A previous version of this proposal was sent to CGMS TFSDC on 11 Dec 2019 and received no negative feedback. Since that version the sequence was refined based on inputs received from ECMWF.

Detailed proposal

Add the following elements to BUFR Table B/07 and B/40: Descriptor Name Units Scale Reference Width
0-07-072 Scan angle deg 2 -9000 15
0-40-074 General interferometry quality flags Flag table 0 0 16
Add flag table 0-40-074 with the following contents: Bit No. Meaning
1 Incompatibility of a scan angle for electroencephalogram
2 Calibration failure (limit of black body temperature reached, not enough sources for interferometry, etc.)
3 Geolocation executed taking into account the orientation of the spacecraft and using the star catalogue
4 High level of cryogenic sediment reached, requiring outgassing of the radiation cooler. Set when NESR level of the ice cover threshold crossed
5 Interferometry package flag
6 General accuracy flag
7 Noise present during the interferometry
8 Outgassing of the radiation cooler
9 Flag preceding the first 24 hours/day mark (on as a rule)
10 Telemetry package flag
11-15 Reserved
All 16 Missing value
Add the following sequence 3 40 018 “Infrared Fourier Spectrometer - 2 (IKFS-2) spectra” to BUFR Table D/40: Table references Element name Element description
3 01 129 Observing satellite and instruments
3 01 130 High precision timestamp
3 01 131 Pixel geolocation
0 07 072 Scan angle
0 40 074 General interferometry quality flags
1 04 000 Delayed repetition of 1 descriptor Repeat for all channels
0 31 002 Delayed descriptor replication factor
2 01 136 Change data width
0 05 042 Channel number
2 01 000 Change data width
0 14 044 Channel radiance
marijanacrepulja commented 4 years ago

Hi @erget, many thanks for the BUFR template, ECMWF is looking forward to receiving the data. I have few comments

erget commented 4 years ago

@marijanacrepulja thanks for pointing out the issue with the duplicated bits - it was a table copying error. In fact I'd set the table wider than needed. We do need a reserved bit at the end so that the entry can be set to MISSING if necessary - if we don't reserve the bit at the end, there's no differentiation between all flags being set and no data being reported.

Concerning the ScanAngle I've added the request for a new descriptor that can be used for this and inserted that into the sequence. How do you find that?

marijanacrepulja commented 4 years ago

@erget thanks for the request for a new descriptor.

As for flag table I meant to write that should start with 1 and finished with All 10 Missing value

For example I copied flag table for 0 02 022 Satellite data-processing technique used Bit No. 1 Processing technique not defined 2 Automated statistical regression 3 Clear path 4 Partly cloudy path 5 Cloudy path 6–7 Reserved All 8 Missing value

erget commented 4 years ago

@marijanacrepulja OK I see what you mean, I've updated the formulation of the table.

jitsukoh commented 4 years ago

@erget Just to avoid any confusion, in the detailed proposals, can I understand:

erget commented 4 years ago

@jitsukoh yes, that's correct - not sure how that slipped past me. I've updated the text accordingly.

erget commented 4 years ago

Input from @SibylleK:

erget commented 4 years ago

@SibylleK I have implemented your requests in the proposal above.

chenxiaoxia2019 commented 4 years ago

@erget Hi, Daniel, I have already created the branch for this issue. Could you please check it and the tables? Thanks.

erget commented 4 years ago

@chenxiaoxia2019 thanks for this. I wasn't sure where to put my comments regarding this so I've duplicated them in the PR and in this issue.

Content review

I've reviewed the CSVs only, although I marked the text and XML files as reviewed in order to make it easier to step through the PR.

I don't think BUFRCREX_TableB_en_07.csv is correct - this was proposed as 007XXX in order to signify that the Secretariat would assign a number, as the proposal was made in parallel with other proposals that might request the same number. We definitely need the real number there. I've noted this on the appropriate line in the PR.

There are also some issues in BUFR_TableD_en_40.csv that I have noted in the PR.

Proposed process improvement

I think we should clarify the review process for instances like this and propose the following workflow:

  1. Proposal is moved through the pipeline as is currently the case, no change
  2. Secretariat implements changes and creates a pull request for review
    • We should have a PR template with a checklist saying what files to check; I assume that the Secretariat is editing certain files by hand and generating the rest using scripts. If we trust the scripts then theoretically we only need to check the source files, which I am assuming are the CSVs
    • On that note we could consider generating the legacy formats (e.g. XML) only on merge so that the review is easier
  3. The proposer reviews and approves PR. I don't believe this was part of the process in the past - do we want to do this regularly moving forward? I believe doing the review directly in the PR is better because it allows commenting the changes line-by-line in a straightforward fashion.
erget commented 4 years ago

I've updated the proposal to use the new descriptor numbers provided by the secretariat. The changes I requested in my previous comment are still outstanding on this issue's branch, but in the assumption that they will be resolved as described in the proposal, I've had sample data created. Please let me know when you've downloaded it so that I can free the storage; this is not a permalink.

marijanacrepulja commented 4 years ago

@erget Dimensions for Radiance in channel in the HDF5 file is AtmSpRadiances (3541,15,2701), meaning that we have 2701 channels. In the proposed template we have
0 31 001 Delayed descriptor replication factor that can't accommodate it. I believe we should use 0 3 1 002 instead. Could you please confirm list of descriptors used in encoding the message as I am struggling to decode the sample data. Many thanks.

erget commented 4 years ago

@marijanacrepulja you're right, while producing the sample data we made some changes to that part of the sequence and I forgot to note that in the proposal. I've now updated the proposal - it matches the sample data, which you should not have been able to read using the definitions noted above before.

@chenxiaoxia2019 I'm not sure about the process for creating new branches that match proposals - I think it might be cleanest for this to be done by the secretariat, i.e. you, but theoretically I see that proposers can implement changes to the branches as well. It gets complex when potentially hitting the same numbers and I'm also not familiar with which files need to be edited and which ones get autogenerated... Could you please implement the necessary changes or let me know what the desired procedure is?

chenxiaoxia2019 commented 4 years ago

Hi, Daniel, @erget I have created all the branches for all the issues respecitively. Take issue 10 for an example, branch "issue-10" is created for it, with the changes made in BUFRCREX_TableB_en_40.csv and BUFR_TableD_en_40.csv. The description for these two files is "update issue 10". AND, you can also use the COMPARE https://github.com/wmo-im/BUFR4/compare/issue-10 to see any changes. What the assignee needs to do is to double check the changes with the proposals. For any contradicted numbers, I always get you and the team updated, to either keep the number or propose a new one.

@chenxiaoxia2019 I'm not sure about the process for creating new branches that match proposals - I think it might be cleanest for this to be done by the secretariat, i.e. you, but theoretically I see that proposers can implement changes to the branches as well. It gets complex when potentially hitting the same numbers and I'm also not familiar with which files need to be edited and which ones get autogenerated... Could you please implement the necessary changes or let me know what the desired procedure is?

erget commented 4 years ago

Hi @chenxiaoxia2019 thanks for the explanation. Some points below.

Hi, Daniel, @erget I have created all the branches for all the issues respecitively. Take issue 10 for an example, branch "issue-10" is created for it, with the changes made in BUFRCREX_TableB_en_40.csv and BUFR_TableD_en_40.csv. The description for these two files is "update issue 10". AND, you can also use the COMPARE https://github.com/wmo-im/BUFR4/compare/issue-10 to see any changes.

Thanks, I'm able to find that based on the fact that you added the "Branch" section to the issue description at the top.

It would be really nice to have a CONTRIBUTING.md file in the repository that describes what you're saying, because it's currently not clear IMHO that the CSV files are the ones that should be reviewed.

What the assignee needs to do is to double check the changes with the proposals. For any contradicted numbers, I always get you and the team updated, to either keep the number or propose a new one.

OK. I guess stated more clearly, I've found issues in the branch devoted to this issue that I have annotated in a pull request so that you can see exactly what lines I'm referring to and requested that you implement changes accordingly. Probably you can get by with double-checking the changes in the branch with the current state of the proposal at the top of this issue - I would be happy to cross-check that for you afterwards.

While I think this is a good way to do things, so that in the end the secretariat is the one implementing changes to the tables, it would also be possible to just make the changes in the branch directly so that we don't add another layer of turnaround. I'm working under the assumption that all table changes should be done by the secretariat and hope that this doesn't unduly burden you.

In sum though it's important to note that the branch currently contains errors, and that there is a contradiction in numbers between this and another proposal that will need to be resolved.

marijanacrepulja commented 4 years ago

@erget I have been able to decode the data. However, I haven't been able to use the tables from the branch, as they are not updated. I spot-checked most of my decoded values, and they all appeared to match those in HDF5 file provided. There is no clear mapping between quality info in HDF5 file and newly introduced flag table, therefore I haven't been able to verify the values.
I have attached the first 10 decoded messages, as dump of the sample file exceeds limit.

W_XX-EUMETSAT-Darmstadt,SURFACE+SATELLITE,M02+OCN_C_EUMP_20200611071500_30735_2.0_VV.bin_decoded.gz

erget commented 4 years ago

@marijanacrepulja thanks for looking. @chenxiaoxia2019 would you be willing to update the branch to match the proposal as described at the top of the issue?

chenxiaoxia2019 commented 4 years ago

@erget Hi, Daniel, branch has already been updated as proposed, changing from 007XXX to 007072 in both table B and table D.

erget commented 4 years ago

@chenxiaoxia2019 I'm actually referring to the changes needed as per https://github.com/wmo-im/BUFR4/issues/10#issuecomment-635875111 and https://github.com/wmo-im/BUFR4/issues/10#issuecomment-646195155. You can see them line-for-line as described in https://github.com/wmo-im/BUFR4/pull/25. In summary changes are needed to the branch because beyond replacing 007XXX because:

Can you have a look please? Feel free to write to me if you need my help.

erget commented 4 years ago

@chenxiaoxia2019 I've corrected the issues on the associated branch, everything should be fine now.

amilan17 commented 4 years ago

SUMMARY: Add entry to Table B Class 40, add Flag Table 0 40 074, add entry to Table B Class 07 and add entries to Table D Category 40.

amilan17 commented 3 years ago

@erget We are preparing the manual for publication and need some clarification. What does "on as a rule" mean? 

9Flag preceding the first 24 hours/day mark (on as a rule) 

Some suggestions:

  1. as a rule
  2. set to on as a rule
  3. on is default flag
  4. ...?
erget commented 3 years ago

@amilan17 (2) is right.