Closed nobodyinperson closed 1 week ago
hi @nobodyinperson - looking into this, we'll get back to you as soon as possible
Hi Max, thank you. And while you're at it, the script also dies when the values for e.g. latitude actually have a mask, which happens for some reason with xarray. In your example, mask=False, but if the variable actually has a mask (a boolean array, however xarray does that), uas2bufr fails, because it doesn't filter for non-masked values.
On Wednesday, 21 August 2024, Max Marno wrote: hi @nobodyinperson - looking into this, we'll get back to you as soon as possible
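To illustrate the masking problem described above, here is a minimal sketch of how masked (and NaN) entries could be filtered consistently across several variables before encoding. The function name `drop_masked_rows` and the variable names are hypothetical and not taken from uas2bufr; the sketch assumes all variables are one-dimensional, numeric, and share the same length.

```python
import numpy as np


def drop_masked_rows(columns):
    """Keep only the rows where every variable is unmasked and finite.

    `columns` maps a variable name to a 1-D array (plain or masked).
    Hypothetical helper, not part of uas2bufr.
    """
    masks = []
    for values in columns.values():
        # Preserve any existing mask and additionally mask NaN/inf entries.
        masked = np.ma.masked_invalid(np.ma.asarray(values, dtype=float))
        # getmaskarray always returns a full boolean array, even for mask=False.
        masks.append(np.ma.getmaskarray(masked))

    # A row is kept only if no variable is masked at that index.
    keep = ~np.logical_or.reduce(masks)

    return {
        name: np.asarray(np.ma.getdata(values), dtype=float)[keep]
        for name, values in columns.items()
    }
```

Filtering by rows rather than per variable keeps latitude, longitude, and the other variables aligned, which matters if each index corresponds to one observation.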
I finally managed to get `uas2bufr` to eat my netCDF file after lots of debugging. It would be really great if `uas2bufr` was more robust and would explain errors better. Besides my above points, for example just bubbling up the gribapi error `OutOfRangeError: Value out of coding range` without any indication of variable name or actual value makes it quite hard to find the actual problem.
But at least I have a working state I can build on now. 👍
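As a sketch of what more helpful error reporting could look like, the wrapper below re-raises any failure with the key name and the offending value attached. The names `EncodingError` and `set_with_context` are hypothetical, and `setter` stands in for whatever call actually writes a value into the BUFR message; nothing here is taken from the uas2bufr code.

```python
class EncodingError(ValueError):
    """Raised when a value cannot be encoded; carries key and value context."""


def set_with_context(setter, key, value):
    """Call setter(key, value) and add context to any failure.

    `setter` is a placeholder for the real key-setting call (an assumption).
    """
    try:
        setter(key, value)
    except Exception as err:
        # Chain the original exception so its traceback is not lost.
        raise EncodingError(
            f"failed to encode {key!r} with value {value!r}: "
            f"{type(err).__name__}: {err}"
        ) from err
```

With a wrapper like this, a message such as `Value out of coding range` would arrive together with the variable name and value that triggered it.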
@nobodyinperson you're right, it's very procedural code. I'm glad you've got a working state now. For your reference, uas2bufr is chiefly based on this repo: https://github.com/marijanacrepulja/uas2bufr. We can consider adding more verbose logging and calling out specific variables where applicable, but at the moment this work isn't planned. PRs welcome!
Understood. No need for this issue then anymore, I guess.
I have big problems with the NetCDF-to-BUFR conversion.
Problem 1: The `eccodes` package

One problem is that `eccodes` is a notoriously fragile and difficult to install library (the dreaded `RuntimeError: Cannot find the ecCodes library` error...), with prebuilt binaries only for very specific python versions, architectures, etc. For now, I managed to build an apptainer based on `docker://python:3.10` and a poetry project with the library versions you detail here:

pyproject.toml

```toml
[tool.poetry]
name = "uas2bufr"
version = "0.1.0"
description = "WMO UASDC NetCDF to bufr"
authors = ["WMO…
```

I also tried with poetry2nix, which is well suited for reproducibility, but even that fails with the dreaded `RuntimeError: Cannot find the ecCodes library` error.

default.nix
```nix
{ pkgs ? (import…
```

It would be really helpful if you provided a reproducible, platform-independent way of running the `uas2bufr` script.

Debugging via the `__error.txt` files landing in the S3 bucket following failure after upload is not viable, obviously.

A Docker image, an Apptainer image, a working nix environment, anything really that just allows running the `uas2bufr` script reliably.

Problem 2: Broken `uas2bufr` script

Now that reproducing the error from the generated `__error.txt` in the S3 bucket is possible, I reproduced it locally and added some debugging prints.

Apparently, time unit parsing is implemented with hard-coded assumptions. In this case, the date `19` is split on the colon `:` for some reason and assumed to contain the fractional seconds. Instead of doing fragile character splitting, I recommend using robust time libraries such as the built-in `datetime` or just `pandas.to_datetime`.

Our campaign ends this week, so unless I find a solution to this, our data can't land in the S3 bucket right after the flights.
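The recommendation to use the built-in `datetime` instead of character splitting could look roughly like the sketch below. It assumes the time units follow the common `<unit> since <reference>` pattern (for example `seconds since 2024-08-19T12:00:00Z`); the actual units string from the failing file is not shown in this issue, and the function names `parse_time_units` and `decode_time` are hypothetical. Reference times without a time zone are assumed to be UTC.

```python
from datetime import datetime, timedelta, timezone

# Seconds per supported unit (only a small, assumed subset of possible units).
_UNIT_SECONDS = {"seconds": 1.0, "minutes": 60.0, "hours": 3600.0, "days": 86400.0}


def parse_time_units(units):
    """Split '<unit> since <reference>' into (seconds per unit, reference datetime)."""
    unit, sep, reference = units.strip().partition(" since ")
    if not sep:
        raise ValueError(f"unsupported time units string: {units!r}")
    unit = unit.strip().lower()
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unsupported time unit {unit!r} in {units!r}")

    # Let datetime parse the ISO 8601 reference; map a trailing 'Z' to an
    # explicit UTC offset, which older Python versions require.
    ref = datetime.fromisoformat(reference.strip().replace("Z", "+00:00"))
    if ref.tzinfo is None:
        ref = ref.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return _UNIT_SECONDS[unit], ref


def decode_time(value, units):
    """Convert a numeric time value with the given units into a datetime."""
    scale, ref = parse_time_units(units)
    return ref + timedelta(seconds=float(value) * scale)
```

Because the reference time is parsed as a whole, a date-only reference such as `seconds since 2024-08-19` works without any assumptions about where colons or fractional seconds appear, and unsupported strings fail with a message that names the offending input.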