ARPA-SIMC / bufr2netcdf

Tools to convert BUFR weather reports in NetCDF file format
GNU General Public License v2.0

Error "vars mismatch" in conversion #11

Closed dcesari closed 5 months ago

dcesari commented 6 months ago

The attached BUFR file min.bufr.gz is a reduced subset of a bigger file with hundreds of messages. When each message of the original file is converted to NetCDF separately, everything works, but with some combinations of messages, such as the one attached, the conversion fails with the error:

out of sync at 0: vars mismatch (007007 != 010007)

I can provide the full file if that may help to evaluate the full spectrum of messages.

Decoding the full file with wrep and dbamsg works correctly.

P.S. Sorry, the file is actually a zip; I erroneously used the .gz suffix.

spanezz commented 6 months ago

What happens is that the first BUFR expresses height as 07007, and a plan is built to fit heights like that into a NetCDF array.

Then the second message comes along and tries to add to the same NetCDF array, but it expresses heights as 10007 instead, which does not match the existing mapping plan.

In this case I feel like the two variables should be treated equally, and I propose to normalize 07007 to 10007, at least in this kind of operation.
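A minimal sketch of the mismatch described above, in Python. It is not bufr2netcdf's actual code: the `Plan` class, its `check` method, and the second code `011001` are hypothetical. The only assumption is that a plan records the ordered variable codes of the first message and rejects later messages that differ. Codes are written with six digits, as in the error message.

```python
class PlanMismatch(Exception):
    """Raised when a message does not follow the plan built from the first one."""


class Plan:
    """Hypothetical mapping plan: the ordered variable codes of the first message."""

    def __init__(self, first_message_codes):
        self.codes = list(first_message_codes)

    def check(self, message_codes):
        """Verify that a later message has the same codes in the same order."""
        message_codes = list(message_codes)
        # Compare position by position and report the first difference.
        for pos, (expected, found) in enumerate(zip(self.codes, message_codes)):
            if expected != found:
                raise PlanMismatch(
                    f"out of sync at {pos}: vars mismatch ({expected} != {found})"
                )
        # Same prefix but a different number of variables is also a mismatch.
        if len(self.codes) != len(message_codes):
            raise PlanMismatch(
                f"out of sync: expected {len(self.codes)} vars, "
                f"found {len(message_codes)}"
            )


# The first message uses 007007 for height, the second uses 010007.
plan = Plan(["007007", "011001"])
try:
    plan.check(["010007", "011001"])
except PlanMismatch as exc:
    print(exc)
```

With these inputs the check fails at position 0, which is the situation reported in this issue.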

spanezz commented 6 months ago

On the other hand, 07007 maps to an array called MH, while 10007 maps to an array called MHVEH. If that is to be respected, it looks like those two BUFRs cannot fit into the same NetCDF file.

dcesari commented 6 months ago

Thank you for pointing that out. I think whoever encoded the data meant the same thing by both, but 07007 seems more appropriate, since it is a coordinate defining a context (the height of the subsequent observations, as in a wind profiler). Just a question: would it be possible to trigger the start of a new NetCDF file when a different descriptor for the height is encountered, as is done in other situations, or would that be tricky?

spanezz commented 6 months ago

It would. Question: suppose that there's a big pack of BUFR files, alternating one with 07007 and one with 10007: naively that would create one NetCDF file per BUFR.

To do a better job, I could maintain multiple NetCDF plans in memory at the same time, and dispatch BUFRs into the first one that fits. We'd need to have a naming scheme for the generated NetCDF files since we'd be generating more than one, but that seems also doable.

Should I start working in this direction?
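The first-fit dispatch proposed above can be sketched roughly as follows. This is a simplified illustration, not the converter's logic: the `dispatch` function is hypothetical, each message is reduced to a tuple of variable codes, and "fits" is assumed to mean "has exactly the same codes in the same order".

```python
def dispatch(messages):
    """Group messages by first fit.

    Each message is a sequence of variable codes. Returns a list of
    (signature, members) pairs, one per plan that would be kept in memory.
    """
    plans = []
    for msg in messages:
        signature = tuple(msg)
        for plan_signature, members in plans:
            if plan_signature == signature:
                # First plan that fits: add the message there.
                members.append(msg)
                break
        else:
            # No existing plan fits: open a new one.
            plans.append((signature, [msg]))
    return plans


# Alternating messages with 007007 and 010007 end up in two plans,
# instead of one output file per message.
a = ["007007", "011001"]
b = ["010007", "011001"]
plans = dispatch([a, b, a, b])
print(len(plans))
```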

dcesari commented 6 months ago

I think that we already have a naming scheme for multiple NetCDF files, something like *.nc for the first file, then *.1.nc, *.2.nc, etc. However, please wait: it is worth checking whether the Cosmo software accepts NetCDF with both descriptors; otherwise it would be better (and probably simpler) to force a conversion. I'll give you some updates.
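As an illustration of the naming scheme mentioned above, here is a small Python sketch. The helper `output_name` is hypothetical, and the exact scheme used by bufr2netcdf may differ from this "something like" description.

```python
def output_name(base, index):
    """Return the file name for the index-th output (0-based).

    Assumed scheme: base.nc for the first file, then base.1.nc, base.2.nc, ...
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    return f"{base}.nc" if index == 0 else f"{base}.{index}.nc"


print([output_name("profiler", i) for i in range(3)])
```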

spanezz commented 6 months ago

> To do a better job, I could maintain multiple NetCDF plans in memory at the same time, and dispatch BUFRs into the first one that fits. We'd need to have a naming scheme for the generated NetCDF files since we'd be generating more than one, but that seems also doable.

It might not be as straightforward: suppose I add some data to the array in one NetCDF and then realise that another part of the BUFR doesn't fit. I'd then need to roll back the previous additions and try adding from scratch to a new NetCDF. There's nothing here that isn't doable, but it may be more work than it looked.
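One way to avoid an explicit rollback is to validate the whole message before writing anything. The sketch below shows that idea in Python under simplifying assumptions. The `Target` class and `add_first_fit` function are hypothetical, a message is a list of `(code, value)` pairs, and the codes and values are illustrative. It is not how bufr2netcdf is implemented.

```python
class Target:
    """Hypothetical in-memory stand-in for one NetCDF output."""

    def __init__(self, codes):
        self.codes = tuple(codes)
        # One column of values per variable position.
        self.columns = [[] for _ in self.codes]

    def try_add(self, message):
        """Add a message of (code, value) pairs, all or nothing."""
        if len(message) != len(self.codes):
            return False
        staged = []
        for expected, (code, value) in zip(self.codes, message):
            if code != expected:
                # Nothing has been written yet, so there is nothing to undo.
                return False
            staged.append(value)
        # The whole message fits: commit the staged values.
        for column, value in zip(self.columns, staged):
            column.append(value)
        return True


def add_first_fit(targets, message):
    """Add the message to the first target that accepts it, or open a new one."""
    for target in targets:
        if target.try_add(message):
            return target
    target = Target(code for code, _ in message)
    target.try_add(message)
    targets.append(target)
    return target


targets = []
add_first_fit(targets, [("007007", 100.0), ("011001", 270)])
# The second message matches at position 0 but not at position 1.
add_first_fit(targets, [("007007", 150.0), ("011002", 5.0)])
print(len(targets), targets[0].columns[0])
```

The cost of this approach is that each message is buffered in memory until it has been fully checked, which trades a little memory for not having to undo partial writes.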

dcesari commented 6 months ago

Thanks, let's give up on this idea. I informed the BUFR provider: either we will find a way to convert the data beforehand or, as a last resort, we will ask you to implement the normalization of 10007 to 07007 in bufr2netcdf. That should probably be controlled by a user flag, because we cannot exclude that, in some cases, 10007 is really what is meant.

dcesari commented 6 months ago

As expected, it is not simple to convert the data upstream, so I would ask you to implement the optional remapping of 10007 to 07007 in bufr2netcdf.
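The requested optional remapping could look roughly like this in Python. This is only a sketch: `remap_codes` and `HEIGHT_REMAP` are hypothetical names, and no actual bufr2netcdf option or flag is implied. Codes are written with six digits, as in the error message.

```python
# Opt-in table: only applied when the user asks for it.
HEIGHT_REMAP = {"010007": "007007"}


def remap_codes(codes, remap=None):
    """Return the codes with the optional remapping applied.

    With remap=None (the default) the codes are returned unchanged, so a
    genuine 010007 is preserved unless the user enables the remapping.
    """
    if not remap:
        return list(codes)
    return [remap.get(code, code) for code in codes]


first = ["007007", "011001"]
second = ["010007", "011001"]

print(remap_codes(second) == first)                # remapping disabled
print(remap_codes(second, HEIGHT_REMAP) == first)  # remapping enabled
```

Applying the remapping before a message is compared with the existing plan would let both kinds of message share one output, while leaving the default behaviour unchanged.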

spanezz commented 5 months ago

While studying how to implement the remapping, I found another issue related to this one, which I reported as #13.