csiro-hydroinformatics / uchronia-time-series

Library for ensemble forecast time series

On some versions of Windows, importing the uchronia module crashes the python process. #1

Open jmp75 opened 6 days ago

jmp75 commented 6 days ago

Reported by my colleague Seline.

Repro

On Windows 11 machines (and at least Windows Server 2022), importing the Python module uchronia crashes the Python process, which then exits. I have not observed this first hand, but Seline mentioned seeing the message "Process finished with exit code -1073741819 (0xC0000005)".

import uchronia
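For reference, the exit code quoted above can be decoded as follows. This is a small sketch rather than anything from the uchronia code base; the helper name to_ntstatus is hypothetical. It reinterprets the signed 32-bit exit code as an unsigned NTSTATUS value, and 0xC0000005 is the Windows code for an access violation (STATUS_ACCESS_VIOLATION).

```python
# Windows NTSTATUS value for an access violation.
STATUS_ACCESS_VIOLATION = 0xC0000005


def to_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit NTSTATUS value."""
    return exit_code & 0xFFFFFFFF


code = to_ntstatus(-1073741819)
print(hex(code))                        # 0xc0000005
print(code == STATUS_ACCESS_VIOLATION)  # True
```

So the message is consistent with native code reading, writing or jumping to an invalid address, rather than with a Python-level exception.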

Stepping through in debug mode, I can get as far as the call to register a callback function:

uchronia_so.RegisterExceptionCallback(_exception_callback_uchronia) (permalink: https://github.com/csiro-hydroinformatics/uchronia-time-series/blob/959a81f4290282c3fee45ea5b57af4d53b381578/bindings/python/uchronia/uchronia/wrap/ffi_interop.py#L89)

where the callback is declared using an "old style" syntax. Been a while.

@uchronia_ffi.callback("void(char *)")
def _exception_callback_uchronia(exception_string):
    """
        This function is called when uchronia raises an exception.
        It sets the global variable ``_exception_txt_raised_uchronia``

        :param cdata exception_string: Exception string.
    """
    global _exception_txt_raised_uchronia
    _exception_txt_raised_uchronia = uchronia_ffi.string(exception_string)

Resources

The CFFI documentation on callbacks has a Warning section stating that callbacks are provided for the ABI mode or for backward compatibility. Given that we are using the ABI mode, I am not sure whether we can migrate to the new style.

arigo commented 6 days ago

The callback is not immediately called, I suspect? The crash occurs when registering the callback only?

Do you get a crash on the same Win11 platform if you make a minimal example: just a Python script, which declares a similar callback with @ffi.callback(), and which just tries to get the low-level address of that callback, e.g. with ffi.cast("int64_t", my_demo_callback)?

jmp75 commented 5 days ago

Yes it is crashing when registering. I will make a smaller repro; I need to anyway to have a failing unit test to start from.
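One possible shape for such a failing test, sketched with the standard library only: import the module in a child interpreter so that a native crash shows up as a non-zero exit code instead of killing the test runner. The helper name import_in_subprocess is hypothetical. The -X faulthandler option asks CPython to dump the Python traceback to stderr on a fatal error, which on Windows includes access violations.

```python
import subprocess
import sys


def import_in_subprocess(module_name: str, timeout: float = 60.0):
    """Import module_name in a fresh interpreter; return (exit code, stderr text)."""
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr


def check_import_does_not_crash(module_name: str) -> None:
    """Raise AssertionError, including the child's stderr, if the import fails."""
    exit_code, stderr = import_in_subprocess(module_name)
    assert exit_code == 0, f"import {module_name} exited with {exit_code}:\n{stderr}"


# Sanity check against a module that is always importable.
check_import_does_not_crash("json")
```

On an affected machine, check_import_does_not_crash("uchronia") would be expected to fail until the crash is fixed.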

jmp75 commented 5 days ago

Added a minimal repro, which fails to reproduce the issue so far. https://github.com/jmp75/py-cffi-callback-repro

jmp75 commented 5 days ago

I managed to get a stack trace. It does look like the call to RegisterExceptionCallback in C/C++ land is where things go awry. This is using a freshly recompiled DLL built with the VS 16 (2019) Visual C++ compiler.

>   00000000000ab160()  Unknown
    datatypes.dll!00007ff93a70ad8c()    Unknown
    _cffi_backend.cp311-win_amd64.pyd!00007ff958d310f3()    Unknown
    _cffi_backend.cp311-win_amd64.pyd!00007ff958d49d6c()    Unknown
    _cffi_backend.cp311-win_amd64.pyd!00007ff958d373de()    Unknown
    python311.dll!00007ff970b083de()    Unknown
    python311.dll!00007ff970b0bddf()    Unknown

It may be time to load the libraries with debug symbols, to see whether we can look at this in more detail.

While simplified at the edges, the minimal example is structurally very similar. It is very puzzling that it runs without issue.