Closed dcesari closed 6 years ago
Any progress on this issue?
I started looking into it, it's not quite straightforward. I'm now refactoring wreport a bit to make assignment of attributes to variables more inspectable
I pushed a new version which implement wrep --trace
(or -t
), which prints a (still partial) trace of the decode operation.
When running it, I noticed that quality information (% confidence, manual/automatic quality control, nominal confidence threshold) are sent multiple times, each time for a different value of generating application.
When existing attributes are decoded, wreport overrides the existing values, and since the last generating application has 0 for all % confidence, 0 is the value that ends up in the B33007 attribute.
How would you like this kind of data to be handled?
Note that there is currently no way to store more than one attribute with the same code on the same variable.
A proposal: would it be possible to keep the last valid confidence value, if any (i.e. not override a valid value with an undef)? Since it would be possible to trick the bufr externally and set to undef the undesired confidence values, this modification would be enough. N.B. the final request concerns bufr2netcdf, so it should work at a library level, not in the command line tools.
The consensus so far is:
@dcesari is it possible to have a BUFR with undefs instead of zeroes to try? I thought about implementing it, and chances are that the existing implementation works already that way
I managed to produce two files:
B005_2018031500_onenull.bufr
the third occurrence of "033007 PER CENT CONFIDENCE" for pressure is set to null for all values of pressure
B005_2018031500_twonull.bufr
the second and third occurrences of "033007 PER CENT CONFIDENCE" for pressure are set to null for all values of pressure
The second is actually what will be desired, i.e. keep the first occurrence of confidence and neglect second and third. Unfortunately wrep reports undef in both cases, meaning it keeps the last regardless of it being undef or not.
With the patch I pushed, it keeps the last non-null value found for an attribute: if you prefer the first, let me know and it's an easy change.
As a future development, I was thinking that it could be possible to make wreport configurable to whitelist/blacklist variables and attributes based on context values. For example, make it so that one could say "keep only attributes from generating application 2". Then one could read the same BUFR multiple times with multiple whitelists/blacklists, to get everything out. If this would be interesting, feel free to copypaste what I said in a new issue.
Thanks for the proposal, for the moments it is enough as is. I confirm that it works as expected.
In the attached bufr.zip file (satellite derived products, with subsets) wreport incorrectly decodes the data confidence information B33007:
The same file dumped with bufr_dump from eccodes or with other emos tools gives values !=0 for B33007, while wreport gives sistematically zero for every occurrence of the descriptor in this and other files.