ufal / morphodita

MorphoDiTa: Morphologic Dictionary and Tagger
Mozilla Public License 2.0
69 stars 7 forks source link

Segfault when used from Python alongside another SWIG-generated lib #14

Closed dlukes closed 4 years ago

dlukes commented 4 years ago

I've stumbled upon a segfault when using MorphoDiTa from Python alongside Manatee (another SWIG-generated library). Specifically, if I import Manatee after MorphoDiTa, then MorphoDiTa segfaults when calling tokenizer.nextSentence. Here's a reproducible example with more details (using Docker): https://github.com/dlukes/morphodita-manatee-python-segfault.

I have also reported this issue to the developers of Manatee on their mailing list as it's unclear to me at this point what the root cause is -- it may be that MorphoDiTa/NameTag doesn't isolate itself well enough from other extension modules, or it may be that Manatee oversteps its boundaries. Given that loading e.g. both MorphoDiTa and NameTag into the same Python interpreter works fine, I'm somewhat leaning towards the latter though.

I realize you may not have bandwidth to look at this since it's a fairly niche issue, and it can be worked around by not having all of these at once in the same Python process. However, if the root cause is a bug lurking somewhere in MorphoDiTa's SWIG interface definition, it may have other unwanted consequences, so I deemed it worth reporting anyway.

foxik commented 4 years ago

This is always very tricky. It can easily be a problem outside SWIG, just in some dynamic linking stuff... I will try looking into it reasonably soon.

foxik commented 4 years ago

BTW, thanks for the reproducible example!

dlukes commented 4 years ago

Thanks for considering looking into this!

foxik commented 4 years ago

OK, it was not that hard (only several hours :-) -- see http://www.swig.org/Doc4.0/Modules.html#Modules_nn2 for technical reference.

TL;DR: different SWIG modules share type tables, so that C++ object inheritance can work across SWIG modules. Furthermore, both MorphoDiTa and Manatee define vector<string> type, and Manatee defines several non-standard methods of this type. Therefore, these types are incompatible across MorphoDiTa and Manatee, and if Manatee was called second, it overwrote the type by its definition, which broke MorphoDiTa. I assume the other order would also break Manatee (any method which worked with vector<string> type).

The referenced manual also describes a way to fix the issue -- each SWIG module can define a "namespace" (SWIG_TYPE_TABLE) such that only SWIG modules in the same "namespace" share types -- so we can set it to ufal_morphodita in MorphoDiTa module and the issue is fixed (checked).

As we will release a new version "soon" (with Python 3.8 support and binary wheels), I will include the fix at that time. If you want to solve the issue in the meantime, it is enough to add #define SWIG_TYPE_TABLE ufal_morphodita before line 13 in bindings/python/morphodita_python.i.

Leaving open until we release a fix.

dlukes commented 4 years ago

Wow, hats off, and thanks for sharing the details of your research! I was really curious what the cause was :) Glad to see it has a reasonable explanation and a standard fix.

Out of curiosity, how did you go about debugging this? I mean very generally, I'm sure you don't have time to go into the details. For instance, is there a way to debug Python extensions with gdb? SO seems to indicate there is, but I've no idea how viable that approach is. Or did you mostly spend time scouring SWIG's docs because you already had hunches about possible causes based on prior experience?

I'd just like to have a vague idea of how it's done because at this point, if I ever had to fix an issue like this, I would be randomly stabbing in the dark :)

I'll let the people on the Manatee mailing list know you've solved this, so that they don't have to duplicate the effort. Thank you very much once again!

foxik commented 4 years ago

Ok, the pandemic stopped my work on the software, but I am back :-) The fix has been comited, and there should be a release in several days.

As for the debugging approach -- yes, you can debug Python extensions with gdb (I just debugged the whole Python process, using and extension with debug information) -- so I saw the crash is happening in SWIG internal methods (some invariant was not fulfilled). So I was looking around how that could happen, but did not find the cause. I then went to the SWIG documentation for Python, but was no better off reading the whole page, and the same for SWIG and C++. Lastly I tried googling and one of the searches directed me to the Module page from SWIG documentation, where the SWIG_TYPE_TABLE is.