Open zklaus opened 1 year ago
If one uses JasPer versions >=3 with threads disabled, everything works fine. However, when we enable threads (POSIX or OpenMP) and decode/encode in parallel with multiple threads, it fails. This is still under investigation.
Thanks @shahramn. So this is an upstream problem, i.e. we know the problem is in eccodes?
I think I understand what the problem is. In a multithreaded setting, only the main thread is allowed to call jas_init_library. All threads (including the main one) must call jas_init_thread and later jas_cleanup_thread, but again only the main one must call jas_cleanup_library (this is all explained here).
At the moment, the initialization happens in grib_jasper_encoding.c:ecc_jasper_initialise, but in a way that makes all threads call jas_init_library (and later jas_cleanup_library in ecc_jasper_cleanup). You need to identify the main thread and make sure that only that one calls jas_init_library and that all threads call jas_init_thread after that point.
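To make the call order concrete, here is a minimal sketch of the pattern in C with POSIX threads. It does not link against JasPer: the four stub_* functions are hypothetical stand-ins for jas_init_library, jas_init_thread, jas_cleanup_thread and jas_cleanup_library that only count how often they are called, and run_workers is an illustrative name, not something from ecCodes. Real code would also need whatever configuration JasPer requires before library initialization, which is left out here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the JasPer calls; they only count invocations. */
static atomic_int lib_inits, lib_cleanups, thread_inits, thread_cleanups;
static atomic_int library_ready;

static void stub_init_library(void) {      /* stands in for jas_init_library */
    atomic_fetch_add(&lib_inits, 1);
    atomic_store(&library_ready, 1);
}
static void stub_cleanup_library(void) {   /* stands in for jas_cleanup_library */
    atomic_store(&library_ready, 0);
    atomic_fetch_add(&lib_cleanups, 1);
}
static void stub_init_thread(void) {       /* stands in for jas_init_thread */
    /* Per-thread init is only valid once the library has been initialized. */
    assert(atomic_load(&library_ready));
    atomic_fetch_add(&thread_inits, 1);
}
static void stub_cleanup_thread(void) {    /* stands in for jas_cleanup_thread */
    atomic_fetch_add(&thread_cleanups, 1);
}

static void *worker(void *arg) {
    (void)arg;
    stub_init_thread();      /* every worker: per-thread init */
    /* ... decode/encode would happen here ... */
    stub_cleanup_thread();   /* every worker: per-thread cleanup */
    return NULL;
}

/* Runs n worker threads (0 <= n <= 64). Returns 0 on success, -1 otherwise. */
int run_workers(int n) {
    pthread_t tids[64];
    if (n < 0 || n > 64) return -1;

    stub_init_library();     /* main thread only, before any worker starts */
    stub_init_thread();      /* the main thread needs per-thread init too */

    int started = 0;
    for (; started < n; started++) {
        if (pthread_create(&tids[started], NULL, worker, NULL) != 0) break;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }

    stub_cleanup_thread();   /* main thread's per-thread cleanup */
    stub_cleanup_library();  /* main thread only, after all workers have joined */
    return started == n ? 0 : -1;
}
```

Calling run_workers(4) from main results in one library init, five thread inits (four workers plus the main thread), five thread cleanups and one library cleanup. The key design point is that the library-level calls bracket the lifetime of all workers, so they cannot race with per-thread calls.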
Thank you so much for looking into this. I will create a bug report and capture this information.
Thank you all for working through this. Do you have a reference for the bug report you filed upstream?
@iainrussell @shahramn Any chance this has been fixed or is being addressed upstream? Currently this pulls in jasper 2 -> jpeg, which means you can't install new builds that are linked against jpeg_turbo 2.1.5.1.
This means you can't install the latest libnetcdf 4.9.2 (and its S3 fixes, etc.) alongside eccodes (and pygrib, cfgrib) in conda-forge.
Is there an actual issue? We didn't pull the packages because we didn't have a reproducer. But if there is a real bug, then we should address it.
Ah, never mind (my apologies for the noise). While it might still need addressing, I missed the responsible party. Looks like pygrib (which uses eccodes, hence the confusion) is what triggers the old install.
Current situation to the best of my understanding: the latest ecCodes (2.30.0) temporarily disables the multi-threading test when jasper is linked, so the tests should all pass now. But the underlying problem has not yet been fixed, so using threads while decoding jpeg-encoded GRIBs with jasper will not work.
Solution to issue cannot be found in the documentation.
Issue
It seems our build of eccodes does not fully work with our builds of Jasper >=3.
This surfaced in this test as part of the netcdf491 migration, but I suspect that it simply is due to the Jasper upgrade that happened since the last eccodes build. The latest eccodes package available is built against Jasper 2.
I suggest pinning Jasper to <3 for now (this works; see #152) and then investigating whether the problem is in Jasper, in Eccodes, or in our builds of either.