pik-copan / pyunicorn

Unified Complex Network and Recurrence Analysis Toolbox
http://pik-potsdam.de/~donges/pyunicorn/

BUG: segfault when calling `CouplingAnalysis.mutual_information()` and `CouplingAnalysis.information_transfer()` #187

Closed: fkuehlein closed this issue 9 months ago

fkuehlein commented 11 months ago

This might be a problem similar to issue #128, pointing to an indexing error in an underlying C function.

I discovered this behaviour when running the test suite. The corresponding tests, test_mutual_information and test_information_transfer, are currently the only ones failing.

coupling_analysis.mutual_information() and coupling_analysis.information_transfer() are the only methods in the class that call coupling_analysis.get_nearest_neighbors(), which relies on funcnet._ext.numerics._get_nearest_neighbors() and the C function _get_nearest_neighbors_fast(), so the bug might be in one of the latter two.

@ntfrgl, it'd be awesome if you could have a look at this! I wasn't able to trace it back any further, as I haven't installed a Cython debugger yet, so I was hoping you might find the bug just by staring at it, as you did in 0ce39ae. :)

If it really is a similar issue, I suppose you'd port those modules to Cython as you did in b7e546d. Let me know if I should try to do that myself; it would certainly take me a bit more time, though.

fkuehlein commented 11 months ago

#162 looks like it might be related to this issue. The Cython/C functions

_cross_correlation_max()/_cross_correlation_max_fast()

and

_cross_correlation_all()/_cross_correlation_all_fast()

used in CouplingAnalysis should therefore be looked into as well.

ntfrgl commented 11 months ago

The test suite passes on the freshly installed master branch for me, hence I am unable to reproduce this issue at the moment. You might want to look at my recent commits touching these files and see if reverting them changes your situation: 6dc0f63, 402197f, 21b0b6f. For example, if reverting the third commit resolved your issue, then that might indicate a platform-specific problem with the ALLOCA macro defined there.
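For context on why such a macro can be platform-sensitive: alloca() allocates scratch memory on the stack, but it is non-standard and lives in different headers on different platforms, and stack overflows from large allocations manifest as segfaults only on some systems. A hedged sketch of a typical portable ALLOCA definition (not necessarily the one in 21b0b6f) looks like this:

```c
#include <string.h>

/* A typical portable ALLOCA macro; the actual definition in
 * src_numerics.c may differ. alloca() is non-standard: it is declared
 * in <alloca.h> on glibc/macOS, while MSVC provides _alloca() via
 * <malloc.h> -- a classic source of platform-specific breakage. */
#if defined(_MSC_VER)
  #include <malloc.h>
  #define ALLOCA(n) _alloca(n)
#else
  #include <alloca.h>
  #define ALLOCA(n) alloca(n)
#endif

/* Hypothetical example: sum the first n ints via a stack-allocated
 * scratch buffer. The buffer is freed automatically on return, but a
 * large n can silently overflow the stack and segfault. */
int sum_first(const int *xs, int n) {
    int *buf = (int *) ALLOCA((size_t) n * sizeof(int));
    memcpy(buf, xs, (size_t) n * sizeof(int));
    int s = 0;
    for (int i = 0; i < n; i++) s += buf[i];
    return s;
}
```

Since alloca() performs no failure check, a too-large or miscomputed allocation size (e.g. from an indexing error upstream) crashes instead of returning NULL, which would match the segfault symptom here.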

funcnet/_ext/src_numerics.c::_get_nearest_neighbors_fast() is the last function in that file which has not been Cythonized yet. It would be great if you could port it, examine the remaining C functions (compare them against the legacy Weave code in the original commits and the current Cython versions), and delete the C file.

fkuehlein commented 11 months ago

That's curious. I did a fresh install in a new conda environment and still get the segfaults when running the test suite via tox.

I admit my system is not the latest, running macOS 10.13, and I'm not sure whether that might be a problem. I do have conda=23.3 and python=3.10, though.

I could not do full reverts of the above commits, because that breaks the Cython compilation on reinstall. Instead, for each of the above commits I reverted only the changes concerning funcnet (including setup.py and core/_ext/types.{pxd|py}), which all compiled fine but didn't fix the segfault.
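Such path-limited partial reverts can be done by applying a path-restricted diff in reverse. A self-contained sketch on a scratch repository (in practice one would substitute pyunicorn's actual commit shas, e.g. 21b0b6f, and the funcnet paths):

```shell
#!/bin/sh
# Demonstrate a path-limited "partial revert": undo only the changes a
# commit made under one directory (here funcnet/), keeping its other
# changes. The repo below is a throwaway stand-in for pyunicorn.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email dev@example.com && git config user.name dev
mkdir funcnet other
echo v1 > funcnet/f.c && echo v1 > other/g.py
git add -A && git commit -qm "base"
echo v2 > funcnet/f.c && echo v2 > other/g.py
git add -A && git commit -qm "touches funcnet and other"
# Reverse only the funcnet/ hunks of the last commit in the worktree:
git diff HEAD^ HEAD -- funcnet | git apply -R
cat funcnet/f.c  # back to v1
cat other/g.py   # still v2
```

This leaves the revert uncommitted in the working tree, so the package can be rebuilt and tested before deciding whether to keep it.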

I also tried Python 3.9 and 3.11, but it made no difference. (It won't install on python=3.8 because numpy=1.25 requires python>=3.9.)

Any more ideas at this point? Otherwise I will try porting funcnet/_ext/src_numerics.c::_get_nearest_neighbors_fast() to Cython and see if that fixes it.
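When porting, a slow brute-force reference implementation can serve as an oracle to validate the Cython version on small random inputs. The sketch below assumes max-norm k-nearest-neighbour indices per sample, as typically used by KSG-type mutual information estimators; the exact semantics of _get_nearest_neighbors_fast() are defined in src_numerics.c and may differ:

```python
import numpy as np

def nearest_neighbors_bruteforce(points, k):
    """Brute-force k-nearest-neighbour indices under the Chebyshev
    (max-norm) metric, excluding each point itself.

    Hypothetical reference oracle, not pyunicorn's actual
    _get_nearest_neighbors_fast(); the real C function may use a
    different metric, ordering, or tie-breaking rule.
    """
    points = np.asarray(points, dtype=float)  # shape (n_samples, dim)
    # Pairwise Chebyshev distances, shape (n_samples, n_samples).
    dists = np.abs(points[:, None, :] - points[None, :, :]).max(axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude self-matches
    # Indices of the k smallest distances per row.
    return np.argsort(dists, axis=1)[:, :k]
```

Comparing such an oracle against the port on randomized inputs would also flag off-by-one indexing errors of the kind suspected here before they can segfault.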

Thanks for the hint, by the way; I hadn't realized funcnet/_ext/src_numerics.c::_cross_correlation_{max|all}_fast() were unused. _symmetrize_by_absmax_fast() is still in there as well, though.

ntfrgl commented 11 months ago

This sounds like a bug whose manifestation is compiler/platform-dependent. Cythonizing would therefore probably take comparable effort to debugging the C version, and reducing the language diversity would be beneficial independently of fixing this bug. Sorry, I overlooked _symmetrize_by_absmax_fast(), but the same reasoning applies to it.

I fixed the NumPy issue in a9a70b2.

fkuehlein commented 11 months ago

Thanks for fixing the NumPy issue! I will look into Cythonizing the remaining C functions for CouplingAnalysis as soon as possible.