JCSDA-internal / ioda-converters

Various converters for getting obs data in and out of IODA

[Bug] Test failures on MacOS with converters using eccodes/cffi #1456

Open srherbener opened 5 months ago

srherbener commented 5 months ago

Current behavior (describe the bug)

I am seeing the following tests fail on my Mac:

99% tests passed, 7 tests failed out of 657

Subproject Time Summary:
ioda    = 106.25 sec*proc (328 tests)

Label Time Summary:
executable           =  12.74 sec*proc (29 tests)
fortran              =   0.37 sec*proc (1 test)
iodaconv             = 151.94 sec*proc (328 tests)
iodaconv_validate    =  41.02 sec*proc (176 tests)
mpi                  =  15.97 sec*proc (33 tests)
script               = 219.80 sec*proc (519 tests)

Total Test time (real) = 258.84 sec

The following tests FAILED:
    677 - test_iodaconv_mrms (Failed)
    756 - test_iodaconv_generic_gnssro_bufr (Failed)
    762 - test_iodaconv_generic_bufr_raob (Failed)
    763 - test_iodaconv_amdar (Failed)
    764 - test_iodaconv_buoy (Failed)
    765 - test_iodaconv_ship (Failed)
    766 - test_iodaconv_synop (Failed)
Errors while running CTest

These appear to be the converters that use eccodes/cffi. Here is an excerpt from the LastTest.log:

"test_iodaconv_mrms" start time: Feb 07 11:32 MST
Output:
----------------------------------------------------------
Traceback (most recent call last):
  File "/Users/stephenh/projects/CONVERTERS/ioda-bundle/build/bin/mrms_grib2ioda.py", line 17, in <module>
    import eccodes
  File "/Users/stephenh/spack-stack/envs/unified-env.mymacos/install/apple-clang/15.0.0/py-eccodes-1.5.0-lcho6v3/lib/python3.10/site-packages/eccodes/__init__.py", line 14, in <module>
    from .highlevel import *  # noqa
  File "/Users/stephenh/spack-stack/envs/unified-env.mymacos/install/apple-clang/15.0.0/py-eccodes-1.5.0-lcho6v3/lib/python3.10/site-packages/eccodes/highlevel/__init__.py", line 2, in <module>
    from .reader import FileReader, MemoryReader, StreamReader  # noqa
  File "/Users/stephenh/spack-stack/envs/unified-env.mymacos/install/apple-clang/15.0.0/py-eccodes-1.5.0-lcho6v3/lib/python3.10/site-packages/eccodes/highlevel/reader.py", line 77, in <module> 
    def pyread_callback(payload, buf, length):
  File "/Users/stephenh/spack-stack/envs/unified-env.mymacos/install/apple-clang/15.0.0/py-cffi-1.15.1-yy3zs6k/lib/python3.10/site-packages/cffi/api.py", line 396, in callback_decorator_wrap
    return self._backend.callback(cdecl, python_callable,
MemoryError: Cannot allocate write+execute memory for ffi.callback(). You might be running on a system that prevents this. For more information, see https://cffi.readthedocs.io/en/latest/using.html#callbacks
<end of output>
Test time =   0.44 sec
----------------------------------------------------------
Test Failed.
"test_iodaconv_mrms" end time: Feb 07 11:32 MST
"test_iodaconv_mrms" time elapsed: 00:00:00
----------------------------------------------------------

To Reproduce

What computer are you running on? MacBook Pro, Sonoma 14.2.1

What compilers/modules are you using? Apple clang 15.0.0, GNU Fortran 12.2.0

Steps to reproduce the behavior

  1. Build jedi-bundle or ioda-bundle with converters enabled
  2. Run ctest

Expected behavior

All ctests pass

Additional information (optional)

PatNichols commented 5 months ago

@srherbener I assume this was with spack-stack 1.6?
Note that the problem is actually in eccodes: either there is a bug there, or the security options are not set to allow the callbacks that eccodes needs. From the cffi docs: https://cffi.readthedocs.io/en/latest/using.html#callbacks

Warning Callbacks are provided for the ABI mode or for backward compatibility. If you are using the out-of-line API mode, it is recommended to use the extern “Python” mechanism instead of callbacks: it gives faster and cleaner code. It also avoids several issues with old-style callbacks:

On less common architecture, libffi is more likely to crash on callbacks (e.g. on NetBSD);

On hardened systems like PAX and SELinux, the extra memory protections can interfere (for example, on SELinux you need to run with deny_execmem set to off).

On Mac OS X, you need to give your application the entitlement com.apple.security.cs.allow-unsigned-executable-memory.

Note also that a cffi fix for this issue was attempted—see the ffi_closure_alloc branch—but was not merged because it creates potential memory corruption with fork().

In other words: yes, it is dangerous to allow write+execute memory in your program; that’s why the various “hardening” options above exist. But at the same time, these options open wide the door to another attack: if the program forks and then attempts to call any of the ffi.callback(), then this immediately results in a crash—or, with a minimal amount of work from an attacker, arbitrary code execution. To me it sounds even more dangerous than the original problem, and that’s why cffi is not playing along.

To fix the issue once and for all on the affected platforms, you need to refactor the involved code so that it no longer uses ffi.callback().

srherbener commented 5 months ago

@PatNichols thanks for the info on this issue. This was run with a spack-stack built from develop branches and is a bit newer than spack-stack-1.6.0. This one is using py-eccodes/1.5.0 and eccodes/2.32.0.

I suspect the cause is that my Mac is on Sonoma 14.2.1, which has probably tightened the JIT restrictions noted in the "On Mac OS X" link in your comment.

I think the preferred option would be for py-eccodes to replace the calls to ffi.callback() as you noted. @BenjaminRuston do you know if the latest py-eccodes/1.6.1 has any updates related to this? Thanks!
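One way to check whether a given Python environment is affected, independently of eccodes, is to try creating a trivial cffi callback directly. The following is a rough sketch, assuming cffi is importable in the same environment the converters use. The function name is hypothetical, and the probe only exercises the old-style `ffi.callback()` path shown in the traceback.

```python
def can_create_ffi_callback():
    """Return True if ffi.callback() works, False if it raises MemoryError,
    or None if cffi is not installed in this environment."""
    try:
        import cffi
    except ImportError:
        return None
    ffi = cffi.FFI()
    try:
        # ABI-mode callback, the same mechanism py-eccodes uses in reader.py.
        callback = ffi.callback("int(int)", lambda x: x + 1)
    except MemoryError:
        # Matches the failure mode in the traceback: the platform refused
        # to provide write+execute memory for the callback trampoline.
        return False
    return callback(1) == 2


print(can_create_ffi_callback())
```

A result of False here would point at the platform restriction rather than at anything specific to the converters.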