NOAA-EMC / NCEPLIBS-bufr

The NCEPLIBS-bufr library contains routines and utilities for working with the WMO BUFR format.

python API still doesn't build on WCOSS2? #600

Closed by jbathegit 1 month ago

jbathegit commented 1 month ago

@rmclaren @aerorahul @edwardhartnett @AlexanderRichert-NOAA @climbfuji @jswhit Have any of you ever gotten the full NCEPLIBS-bufr, including -DENABLE_PYTHON=ON, to build and run properly on WCOSS2 with either the Intel or GNU compilers? If so, could you please help me figure out what I'm doing wrong?

I can't build it under Intel with python/3.12.10 because it can't find the numpy module. And if I try to build under GNU with python/3.8.6 (which is the latest version of python they have available for GNU), I get a boatload of build diagnostics, and then at runtime the import of the extension module fails:

1/101 Testing: test_pyncepbufr_checkpoint
1/101 Test: test_pyncepbufr_checkpoint
Command: "/apps/spack/cmake/3.20.2/intel/19.1.3.304/utnbptm3hrf7gppztidueu4jogfgemut/bin/cmake" "-E" "env" "PYTHONPATH=/lfs/h2/emc/obsproc/noscrub/jeff.ator/NCEPLIBS-bufr-GitHub/build3/python:/apps/prod/python-modules/3.8.6/gcc/10.2.0/lib/python3.8/site-packages:/apps/ops/prod/nco/core/prod_util.v2.0.14/ush" "/apps/spack/python/3.8.6/gcc/10.2.0/jsduzkud5ggl6jrg6lm4h7xmub5nq3ay/bin/python3.8" "/lfs/h2/emc/obsproc/noscrub/jeff.ator/NCEPLIBS-bufr-GitHub/nceplibs-bufr/python/test/test_checkpoint.py"
Directory: /lfs/h2/emc/obsproc/noscrub/jeff.ator/NCEPLIBS-bufr-GitHub/build3/test/testfiles
"test_pyncepbufr_checkpoint" start time: Jun 04 16:39 UTC
Output:
----------------------------------------------------------
Traceback (most recent call last):
  File "/lfs/h2/emc/obsproc/noscrub/jeff.ator/NCEPLIBS-bufr-GitHub/nceplibs-bufr/python/test/test_checkpoint.py", line 2, in <module>
    import ncepbufr
  File "/lfs/h2/emc/obsproc/noscrub/jeff.ator/NCEPLIBS-bufr-GitHub/build3/python/ncepbufr/__init__.py", line 1, in <module>
    import _bufrlib
ImportError: /lfs/h2/emc/obsproc/noscrub/jeff.ator/NCEPLIBS-bufr-GitHub/build3/python/_bufrlib.cpython-38-x86_64-linux-gnu.so: undefined symbol: __asan_option_detect_stack_use_after_return
<end of output>
Test time =   0.11 sec
----------------------------------------------------------
Test Failed.
"test_pyncepbufr_checkpoint" end time: Jun 04 16:39 UTC
"test_pyncepbufr_checkpoint" time elapsed: 00:00:00
----------------------------------------------------------

Any help would be appreciated, because I'm trying to chase down a runtime error in #599, which is hard to do if I can't reproduce whatever the problem is locally.

AlexanderRichert-NOAA commented 1 month ago

Are you building the library with address sanitization (ASan)?

jbathegit commented 1 month ago

I was, since that's my default configuration for GNU, but I can try turning that off if need be.

AlexanderRichert-NOAA commented 1 month ago

Based on the output, it might be worth disabling. If that still doesn't work, I can take a crack at it.
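
For context, the undefined symbol `__asan_option_detect_stack_use_after_return` belongs to the AddressSanitizer runtime. An extension module compiled with `-fsanitize=address` expects that runtime to be present in the process already, and a stock Python interpreter normally does not provide it, so the import fails. A minimal sketch of how one might check which built modules carry ASan references is shown below. It is a crude byte scan rather than a real symbol-table lookup, and the function names are hypothetical, not part of NCEPLIBS-bufr.

```python
from pathlib import Path

# Every AddressSanitizer runtime symbol starts with this prefix.
ASAN_MARKER = b"__asan_"


def references_asan(shared_object):
    """Return True if the file's raw bytes mention an ASan symbol name.

    Heuristic only: it does not distinguish undefined references from
    a statically linked copy of the runtime.
    """
    return ASAN_MARKER in Path(shared_object).read_bytes()


def find_asan_modules(build_dir):
    """List every *.so file under build_dir that mentions an ASan symbol."""
    return sorted(
        path
        for path in Path(build_dir).rglob("*.so")
        if path.is_file() and references_asan(path)
    )
```

Calling `find_asan_modules` on the build directory (for example the `build3` directory in the log above) would be expected to list `_bufrlib.cpython-38-x86_64-linux-gnu.so` if it was compiled with the sanitizer enabled.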

aerorahul commented 1 month ago

@jbathegit Python 3.12.10 on WCOSS2 does not have numpy. In order to build the Python extensions, a virtualenv with Python 3.12.10, numpy, and any other required packages will need to be created and loaded as a prerequisite.
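
A rough sketch of that prerequisite, using only the standard-library `venv` module, might look like the following. It assumes it is run with the Python 3.12.10 interpreter itself, that `pip` can reach a package index from the machine (not confirmed anywhere in this thread for WCOSS2), and a Linux-style `bin/` layout. The function names are hypothetical.

```python
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir):
    """Path of the interpreter inside the environment (POSIX layout assumed)."""
    return Path(env_dir) / "bin" / "python"


def pip_install_command(env_dir, packages):
    """Build the pip command line that installs packages into the environment."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", *packages]


def create_env_with_packages(env_dir, packages=("numpy",)):
    """Create the environment, then install the requested packages into it."""
    # The new environment uses whichever interpreter runs this script.
    venv.EnvBuilder(with_pip=True).create(env_dir)
    subprocess.run(pip_install_command(env_dir, packages), check=True)
    return venv_python(env_dir)
```

Calling `create_env_with_packages("bufr-env")`, where `bufr-env` is a placeholder directory name, returns the path of the new interpreter. The build would then need to be pointed at that interpreter instead of the system one; how to do that for this project is not covered in the thread.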

jbathegit commented 1 month ago

@aerorahul thanks, but setting up something like that for Intel is a bit out of my zone of expertise, so I'd need some help to proceed down that road.

@AlexanderRichert-NOAA thanks for the tip, and it does build and run now in GNU with asan and other flags disabled, at least within ctest. Looking at the test log, it looks like the command it ran was:

Command: "/apps/spack/cmake/3.20.2/intel/19.1.3.304/utnbptm3hrf7gppztidueu4jogfgemut/bin/cmake" "-E" "env" "PYTHONPATH=/lfs/h2/emc/obsproc/noscrub/jeff.ator/NCEPLIBS-bufr-GitHub/build6/python:/apps/prod/python-modules/3.8.6/gcc/10.2.0/lib/python3.8/site-packages:/apps/ops/prod/nco/core/prod_util.v2.0.14/ush" "/apps/spack/python/3.8.6/gcc/10.2.0/jsduzkud5ggl6jrg6lm4h7xmub5nq3ay/bin/python3.8" "/lfs/h2/emc/obsproc/noscrub/jeff.ator/NCEPLIBS-bufr-GitHub/nceplibs-bufr/python/test/test_misc.py"

but I haven't been able to figure out how to run that manually outside of ctest. If I try to do that it just gives me a "Permission denied", which I'm still trying to chase down. But some progress at least, so thanks again for that!
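
One way to reproduce such a run outside ctest is sketched below. The logged command uses `cmake -E env NAME=VALUE ... COMMAND ARGS`, which sets the listed variables and then runs the command, so the same effect can be had by splitting the logged line and launching the command directly with those variables added to the environment. This is only an illustration with hypothetical function names, and it does not diagnose the "Permission denied" error, whose cause is not identified in the thread.

```python
import os
import re
import shlex
import subprocess

# Matches a NAME=VALUE token as accepted by "cmake -E env".
_ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")


def parse_ctest_command(line):
    """Split a ctest 'Command:' log line into (env_overrides, argv)."""
    text = line.strip()
    if text.startswith("Command:"):
        text = text[len("Command:"):]
    tokens = shlex.split(text)  # honours the double quotes used in the log
    if tokens[1:3] != ["-E", "env"]:
        raise ValueError("expected a 'cmake -E env ...' command line")
    overrides = {}
    rest = tokens[3:]
    # Leading NAME=VALUE tokens are environment settings, not the program.
    while rest and _ASSIGNMENT.match(rest[0]):
        name, _, value = rest.pop(0).partition("=")
        overrides[name] = value
    if not rest:
        raise ValueError("no command found after the environment settings")
    return overrides, rest


def run_like_ctest(line, cwd=None):
    """Run the logged command with the logged variables added to os.environ."""
    overrides, argv = parse_ctest_command(line)
    return subprocess.run(argv, env={**os.environ, **overrides}, cwd=cwd)
```

The `cwd` argument matters because ctest runs each test from the directory shown on the `Directory:` line of the log (the `test/testfiles` directory of the build tree in the earlier output), and tests that open files by relative path may depend on it.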

jbathegit commented 1 month ago

I'm able to work with this now if I just stay inside the ctest build environment - thanks everyone!