Closed Quuxplusone closed 4 years ago
Bugzilla Link | PR43830 |
Status | RESOLVED FIXED |
Importance | P enhancement |
Reported by | sguelton@redhat.com |
Reported on | 2019-10-28 09:42:32 -0700 |
Last modified on | 2019-11-21 16:05:56 -0800 |
Version | 9.0 |
Hardware | PC Linux |
CC | jdevlieghere@apple.com, labath@google.com, llvm-bugs@lists.llvm.org, mgorny@gentoo.org, tstellar@redhat.com |
Fixed by commit(s) | rG9357b5d, rGdf3ae1e, rG186c848, rG6d7bc60 |
Attachments | |
Blocks | PR43360 |
Blocked by | |
See also |
I've seen a similar problem with 3.7 yesterday but attributed it to broken local build. I think it used to work, so you may want to try bisecting it.
Educated guess based on first investigation: there might be a symbol confusion between libedit, as used by lldb, and read readline, as bundled in Python. Still investigating.
More info on that one:
lldb is linked against libedit, and libpython dynaically loads readline.cpython-36m-x86_64-linux-gnu.so which is itself linked against libreadline. Both library define rl_initialize
, and in the lookup order, libedit comes before libreadline so when the readline python native extension initializes, it calls the libedit one, which wreaks havoc.
I managed to workaround that issue by injecting a fake readline module into the embeded Python interpreter before it loads
static PyObject* fakereadline() {
PyErr_SetImportError(PyUnicode_FromString("readline"), nullptr, nullptr);
return nullptr;
}
(...)
PyImport_AppendInittab("readline", &fakereadline);
This basically defines a new readline module that superseds the default one, and raises an import error when it is imported, which seems ok from the lldb perspective. It could also return an empty module.
This is half hacky, half decent, any thoughts/alternatives?
That would mean we lose the interactive command line in the built-in python
interpreter, right? That is something people do care about, as we've in the
past had a fake "readline" module to make this work. However, it was removed in
<https://reviews.llvm.org/D59972>, because we got the impression it is not
needed anymore -- it seems we were wrong.
As an alternative, maybe we could avoid linking against libedit.so, and either:
a) link it statically (libedit.a). Then our linker version script should ensure
that these symbols are not available to python. This would be my preferred
solution, but I don't know if there are any environments which have libedit.so,
but not .a. Potentionaly, we could just prefer linking to the .a file, and fall
back to .so (maybe with a warning) if the .a is not available.
b) use dlopen(RTLD_LOCAL). This should achieve the same effect but still enable
dynamic linking. I don't know if there is any ld(1) flag which would enable us
to achieve the RTLD_LOCAL semantics without needing to dlopen/dlsym everything.
BTW, this is now failing for me too, but there have been a couple of changes since I last ran the test suite, so it's hard to pin down the exact problem. I'll try to do some bisection to figure that out...
Bug reported on cpython side: https://bugs.python.org/issue38634
And a cpython PR: https://github.com/python/cpython/pull/16986
Nice detective work. It sounds like the root cause is libedit incompatibility, but as python already has code to work around that, I'm hoping they won't have a problem with extending the workaround to other platforms. I'll also try to contact the libedit maintainer to see if this incompatibility can be fixed in libedit.
Unfortunately, I think we will still need somehow address this in lldb too, since it will take a while before the fixed python versions are readily available.
In other news, I've figured out why this has started "suddenly" failing for me -- reconfiguring the tree (which I've needed to done for the github switch) chose python3 by default, whereas I've previously had python 2.7 selected. Switching back to python2 makes the problem "disappear".
Should we just add the old workaround back?
Pavel, when you say this repros with Python 3, is that 3.6 or 3.7?
(In reply to Jonas Devlieghere from comment #8)
> Should we just add the old workaround back?
Yeah... maybe? Though that makes me sad as I found that workaround quite gross
and was very happy to see it go... And I understand a lot of linux distros were
not shipping it because installing it would mean it would override the readline
module globally. mgorny may know more about that.
I think I'd prefer some combination of (a) and (b) from #4.
>
> Pavel, when you say this repros with Python 3, is that 3.6 or 3.7?
This was with python 3.6, but judging by the nature Serge's fix, I'd expect
that the same problem is manifested on a wide range of python versions (I have
no idea why 2.7 is immune).
Honestly, the solution I'd really prefer to see is ability to build CPython with libedit instead of readline. But that's some tedious work, and I suspect it will take real effort to convince Python upstream to accept it.
But that's some tedious work, and I suspect it will take real effort to convince Python upstream to accept it.
I'm more optimist than this :-) Hoewer it may not be backported so we still need to have an option.
I've been inspecting https://reviews.llvm.org/D59972 and it turns out we can easily build the module and only load it inside the Python interpreter embeded into lldb, hotpatching the host readline. This would prevent the existing issue without any impact on the system.
I'll propose a review going that way.
(In reply to Michał Górny from comment #10)
> Honestly, the solution I'd really prefer to see is ability to build CPython
> with libedit instead of readline. But that's some tedious work, and I
> suspect it will take real effort to convince Python upstream to accept it.
Judging by the last comment on <https://github.com/python/cpython/pull/16986>,
it looks like they would be open to that. However, won't that just reverse the
problem? I.e., if you have a libedit version of python but embed it into a
readline application (gdb?), won't you still get a crash ?
What would be the ideal solution IMO would be some version of RTLD_DEEPBIND
dlopen flag, but which works for static dynamic linking (whatever you want to
call it -- DT_NEEDED). That way, python would always get the library it was
linked against, instead of some random library in the same process...
This is a work in progress but it solves the issue on my config, while not polluting the python global namespace.
https://sergesanspaille.fedorapeople.org/0001-Revert-Python-Remove-readline-module-patches.patch
I think I can make it cleaner though :-)
Should be fixed by https://reviews.llvm.org/D69793
Pavel,
Are these OK to merge to 9.0.1:
https://reviews.llvm.org/rGdf3ae1eb296d5193232649b5f282dfc4f01ba61f
https://reviews.llvm.org/rG9357b5d08497326a1895cab6c1d712bf12a34519
(In reply to Tom Stellard from comment #15)
> Pavel,
>
> Are these OK to merge to 9.0.1:
>
> https://reviews.llvm.org/rGdf3ae1eb296d5193232649b5f282dfc4f01ba61f
> https://reviews.llvm.org/rG9357b5d08497326a1895cab6c1d712bf12a34519
The change is slightly risky, but it has been sitting in the tree for over a
week, so it hopefully won't break in a spectacular way. Also, the problem it is
fixing is pretty important too.
I'm not sure what's our bar for 9.0.1 at this point, but I would be inclined to
cherry-pick it.
Cherry-picked and commited in release/9.x as :
186c848d84f1a4e7cd5b217ff3ae26083e2fdedf
6d7bc6033d7d3e452eeb181156e6c1bc2b14b897