What happened?
I'm attempting to parse a grib2 file that, unfortunately, has multi-field grib messages encoded. A copy of the grib2 file from the RAP can be found here. I know that this feature is not recommended, but unfortunately, NCEP and other data providers have not heeded such advice. For multiple reasons, I need to be able to read the file through byte streaming, as described in this documentation. However, after a significant amount of time beating my head against a wall, I cannot get eccodes to properly handle byte stream messages that have multiple fields encoded. The only way I can get this to work is when using the codes_new_from_file API, and I cannot brute-force using BytesIO objects due to #25.
The motivation behind this is to fix an issue with the Kerchunk library currently being unable to parse multi-field messages. Since this is meant for scanning and reading byte ranges from cloud data sources, it is kind of crucial to be able to handle byte streams and parse a multi-field encoded message... but I'm increasingly getting the sense that the eccodes core library doesn't support this either.
I realize this issue may be rooted in the core eccodes library that this python repo wraps, but I wanted to start here first just in case. Is there anything I can do differently to achieve the desired functionality, or is Kerchunk's desire to read grib messages through byte streaming effectively unattainable?
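For context, whether a given byte buffer really holds a multi-field message can be checked without eccodes by walking the GRIB2 section headers. The following is only a minimal sketch: the function name count_grib2_fields is hypothetical (it is not part of eccodes or Kerchunk), it assumes the buffer starts exactly at the "GRIB" indicator of a single edition 2 message, and it does not decode any values. It counts Product Definition Sections (section 4), since each field in a message carries its own.

```python
import struct


def count_grib2_fields(buf: bytes) -> int:
    """Count the fields in one GRIB2 message by walking its section headers.

    Hypothetical helper for illustration; assumes buf begins at "GRIB".
    """
    # Section 0 is 16 octets: "GRIB", 2 reserved, discipline, edition, total length.
    if buf[:4] != b"GRIB" or buf[7] != 2:
        raise ValueError("not a GRIB edition 2 message")
    total_len = struct.unpack(">Q", buf[8:16])[0]

    pos, fields = 16, 0
    while pos < total_len:
        if buf[pos:pos + 4] == b"7777":  # Section 8 marks the end of the message
            break
        # Sections 1-7 begin with a 4-octet length and a 1-octet section number.
        sec_len, sec_num = struct.unpack(">IB", buf[pos:pos + 5])
        if sec_len < 5:
            raise ValueError("corrupt section length")
        if sec_num == 4:  # one Product Definition Section per field
            fields += 1
        pos += sec_len
    return fields
```

A result greater than 1 for a buffer means that buffer is a multi-field message, which is the case where codes_new_from_message only gives me a single field.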
Edit: I forgot to mention, calling eccodes.codes_grib_multi_support_on() before eccodes.codes_new_from_message(bytes(buf)) appears to have no effect.
What are the steps to reproduce the bug?
The message bytes in question:
Version
2.36.0
Platform (OS and architecture)
Linux DESKTOP-5DTCKL2.attlocal.net 6.9.12-100.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Jul 27 16:09:11 UTC 2024 x86_64 GNU/Linux
Relevant log output
No response
Accompanying data
No response
Organisation
NOAA