wlav / cppyy

cppexec terminates process #32

Closed saraedum closed 2 months ago

saraedum commented 2 years ago

Some invalid code terminates the Python process when running through cppyy.cppexec. I guess the following should just produce an exception but not actually crash?

>>> import cppyy
>>> cppyy.cppexec('a + 1')
terminate called after throwing an instance of 'cling::CompilationException'
  what():  Error evaluating expression (a + 1)
 Generating stack trace...
/tmp/ruth/micromamba/envs/arbxx-build/bin/addr2line: DWARF error: section .debug_info is larger than its filesize! (0x9bd663 vs 0x526480)
 0x00007fc95ca3b535 in abort + 0x121 from /lib/x86_64-linux-gnu/libc.so.6
 0x00007fc95c0ebfac in __gnu_cxx::__verbose_terminate_handler() at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1634095553113/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/vterminate.cc:95 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/lib-dynload/../../libstdc++.so
 0x00007fc95c0ea56c in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/lib-dynload/../../libstdc++.so
 0x00007fc95c0ea5be in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/lib-dynload/../../libstdc++.so
 0x00007fc95c0ea7b3 in __cxa_rethrow at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1634095553113/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/eh_throw.cc:100 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/lib-dynload/../../libstdc++.so
 0x00007fc95983d3e9 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/site-packages/cppyy_backend/lib/libCling.so
 0x00007fc9599f8b30 in cling::runtime::internal::EvaluateDynamicExpression(cling::Interpreter*, cling::runtime::internal::DynamicExprInfo*, clang::DeclContext*) + 0x1d0 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/site-packages/cppyy_backend/lib/libCling.so
 0x00007fc95c2720a9 in <unknown function>
 0x00007fc959a7ca77 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/site-packages/cppyy_backend/lib/libCling.so
 Generating stack trace...
/tmp/ruth/micromamba/envs/arbxx-build/bin/addr2line: DWARF error: section .debug_info is larger than its filesize! (0x9bd663 vs 0x526480)
 0x00007fc95ca3b535 in abort + 0x121 from /lib/x86_64-linux-gnu/libc.so.6
 0x00007fc95c0ebfac in __gnu_cxx::__verbose_terminate_handler() at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1634095553113/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/vterminate.cc:95 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/lib-dynload/../../libstdc++.so
 0x00007fc95c0ea56c in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/lib-dynload/../../libstdc++.so
 0x00007fc95c0ea5be in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/lib-dynload/../../libstdc++.so
 0x00007fc95c0ea7b3 in __cxa_rethrow at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1634095553113/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/eh_throw.cc:100 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/lib-dynload/../../libstdc++.so
 0x00007fc95983d3e9 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/site-packages/cppyy_backend/lib/libCling.so
 0x00007fc9599f8b30 in cling::runtime::internal::EvaluateDynamicExpression(cling::Interpreter*, cling::runtime::internal::DynamicExprInfo*, clang::DeclContext*) + 0x1d0 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/site-packages/cppyy_backend/lib/libCling.so
 0x00007fc95c2720a9 in <unknown function>
 0x00007fc959a7ca77 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.10/site-packages/cppyy_backend/lib/libCling.so

This is the latest cppyy 2.2.0 as distributed on conda-forge for linux-64.
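Because the failure aborts the interpreter instead of raising, a reproducer is easier to script if the snippet runs in a child process. The following is a minimal sketch using only the standard library; the helper name `run_isolated` is hypothetical, and the signal check assumes a POSIX system, where `std::terminate` ends in `abort()` and the child dies with SIGABRT.

```python
import signal
import subprocess
import sys


def run_isolated(snippet, timeout=120):
    """Run a Python snippet in a child interpreter and report how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        errors="replace",
        timeout=timeout,
    )
    # On POSIX, a negative return code means the child was killed by a signal.
    aborted = proc.returncode == -signal.SIGABRT
    return proc.returncode, aborted, proc.stderr
```

Calling `run_isolated("import cppyy; cppyy.cppexec('a + 1')")` would then be expected to report `aborted` as true on an affected install and false where the `SyntaxError` is raised normally.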

saraedum commented 2 years ago

Downgrading to cppyy 1.9.5 fixes the issue for me.

wlav commented 2 years ago

Yes, that C++ exception should have been caught (with pip; 2.2.0; Linux):

>>> import cppyy
>>> cppyy.cppexec('a + 1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wlav/cppyy-dev/cppyy/python/cppyy/__init__.py", line 211, in cppexec
    raise SyntaxError('Failed to parse the given C++ code%s' % err.err)
SyntaxError: Failed to parse the given C++ code
input_line_18:2:2: error: unexpected namespace name 'a':
      expected expression
 a + 1;
 ^

saraedum commented 2 years ago

Still an issue with 2.3.0 btw.

saraedum commented 2 years ago

Strangely, this is specific to cppexec but works fine with cppdef. I wonder what's so different about these two.

saraedum commented 2 years ago

I see a similar crash when an operator throws:

In [1]: import cppyy

In [2]: cppyy.cppdef('''
   ...: class X{};
   ...: X x;
   ...: X operator+(const X&lhs, const X&rhs) { throw std::logic_error("not implemented"); }
   ...: ''')
Out[2]: True

In [3]: cppyy.gbl.x + cppyy.gbl.x # works
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 cppyy.gbl.x + cppyy.gbl.x

TypeError: none of the 2 overloaded methods succeeded. Full details:
  X ::operator+(const X& lhs, const X& rhs) =>
    logic_error: not implemented
  X ::operator+(const X& lhs, const X& rhs) =>
    logic_error: not implemented

In [4]: cppyy.cppexec('x+x;') # crashes
terminate called after throwing an instance of 'std::logic_error'
  what():  not implemented

wlav commented 2 years ago

Yes, cppdef() and cppexec() are not the same thing underneath. The former is equivalent to #include "some_header.h", with the text of some_header.h passed as the argument to cppdef(): it is parsed and expected to be compliant C++. cppexec(), on the other hand, is executed (almost) as it would be on the Cling REPL, so it can access pre-existing state, and there are a couple of "niceties" for interactive convenience.
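The split can be illustrated with a small sketch: declarations go through cppdef(), and statements that act on the resulting state go through cppexec(). The namespace `cppexec_demo` and the function `define_then_execute` are made up for this example, and the `ImportError` guard is only there so the file can be loaded where cppyy is not installed.

```python
try:
    import cppyy
except ImportError:  # cppyy not installed; the function below is then unusable
    cppyy = None


def define_then_execute():
    # cppdef: declarations, parsed like the contents of a header file.
    cppyy.cppdef("""
    namespace cppexec_demo {
        int counter = 0;
        void bump() { ++counter; }
    }
    """)
    # cppexec: statements, run against the state created above.
    cppyy.cppexec("cppexec_demo::bump();")
    cppyy.cppexec("cppexec_demo::bump();")
    # Read the C++ global back through the bindings.
    return cppyy.gbl.cppexec_demo.counter
```

The function should be called only once per process, since a second cppdef() of the same names would be a redefinition.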

Still remains that I can't reproduce this (neither PyPI nor conda version) ...

One difference in terms of exception handling is that cppdef() only catches exceptions in CPyCppyy, whereas cppexec() also has a try/catch block in cppyy-cling. I can't think of a reason why that should matter, but obviously something does.

Can you try the following? It is a simpler route to Cling evaluate(), with no pre-processing of the line and no extra try/catch block:

import cppyy
import ctypes
import sys

cppyy.cppdef('''
class X{};
X x;    
X operator+(const X&lhs, const X&rhs) { throw std::logic_error("not implemented"); }
''')

def cppexec_with_calc(stmt):
    """Execute C++ statement <stmt> in Cling's global scope."""
    if stmt and stmt[-1] != ';':
        stmt += ';'

    # capture stderr, but note that ProcessLine could legitimately be writing to
    # std::cerr, in which case the captured output needs to be printed as normal
    with cppyy._stderr_capture() as err:
        errcode = ctypes.c_uint(0)
        cppyy.gbl.gInterpreter.Calc(stmt, ctypes.pointer(errcode))

    if errcode.value: 
        raise SyntaxError('Failed to parse the given C++ code%s' % err.err)
    elif err.err and err.err[1:] != '\n':
        sys.stderr.write(err.err[1:])

    return True

cppyy.cppexec = cppexec_with_calc

cppyy.cppexec('x+x;')

saraedum commented 2 years ago

Sorry for the late reply. I get the same with the above:

In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:import cppyy
import ctypes
import sys

cppyy.cppdef('''
class X{};
X x;
X operator+(const X&lhs, const X&rhs) { throw std::logic_error("not implemented"); }
''')

def cppexec_with_calc(stmt):
    """Execute C++ statement <stmt> in Cling's global scope."""
    if stmt and stmt[-1] != ';':
        stmt += ';'

    # capture stderr, but note that ProcessLine could legitimately be writing to
    # std::cerr, in which case the captured output needs to be printed as normal
    with cppyy._stderr_capture() as err:
        errcode = ctypes.c_uint(0)
        cppyy.gbl.gInterpreter.Calc(stmt, ctypes.pointer(errcode))

    if errcode.value:
        raise SyntaxError('Failed to parse the given C++ code%s' % err.err)
    elif err.err and err.err[1:] != '\n':
        sys.stderr.write(err.err[1:])

    return True

cppyy.cppexec = cppexec_with_calc

cppyy.cppexec('x+x;')
:--
ERROR! Session/line number was not unique in database. History logging moved to new session 320
terminate called after throwing an instance of 'std::logic_error'
  what():  not implemented
 Generating stack trace...
/tmp/ruth/micromamba/envs/arbxx-build/bin/addr2line: DWARF error: section .debug_info is larger than its filesize! (0x9bd1aa vs 0x525748)
 0x00007feed7ad4535 in abort + 0x121 from /lib/x86_64-linux-gnu/libc.so.6
 0x00007feed4386fac in __gnu_cxx::__verbose_terminate_handler() at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1650668893531/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/vterminate.cc:95 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007feed438556c in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007feed43855be in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007feed43857b3 in __cxa_rethrow at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1650668893531/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/eh_throw.cc:100 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007feed408d0a7 in <unknown function>
 0x00007feecdcea537 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 Generating stack trace...
/tmp/ruth/micromamba/envs/arbxx-build/bin/addr2line: DWARF error: section .debug_info is larger than its filesize! (0x9bd1aa vs 0x525748)
 0x00007feed7ad4535 in abort + 0x121 from /lib/x86_64-linux-gnu/libc.so.6
 0x00007feed4386fac in __gnu_cxx::__verbose_terminate_handler() at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1650668893531/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/vterminate.cc:95 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007feed438556c in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007feed43855be in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007feed43857b3 in __cxa_rethrow at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1650668893531/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/eh_throw.cc:100 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007feed408d0a7 in <unknown function>
 0x00007feecdcea537 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so

saraedum commented 2 years ago

It's puzzling that you cannot reproduce this. I get this on a recent ArchLinux and on a somewhat old Debian system. Let me try to create a reproducer in a docker image or something similarly reproducible.

saraedum commented 2 years ago

Here's a docker reproducer (though with a different error):

$ docker run -it condaforge/mambaforge:4.12.0-0
# mamba install cppyy=2.3.0
# python
>>> import cppyy
>>> cppyy.cppexec('a + 1')
terminate called after throwing an instance of 'cling::CompilationException'
  what():  Error evaluating expression (a + 1)
 0x00007f3ce9923859 in abort + 0x12b from /lib/x86_64-linux-gnu/libc.so.6
 0x00007f3ce930bfac in _ZN9__gnu_cxx27__verbose_terminate_handlerEv + 0xc0 from /opt/conda/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f3ce930a56c in <unknown> from /opt/conda/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f3ce930a5be in <unknown> from /opt/conda/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f3ce930a7b3 in __cxa_rethrow + 0x0 from /opt/conda/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f3ce7324399 in <unknown> from /opt/conda/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 0x00007f3ce74bbee0 in _ZN5cling7runtime8internal25EvaluateDynamicExpressionEPNS_11InterpreterEPNS1_15DynamicExprInfoEPN5clang11DeclContextE + 0x190 from /opt/conda/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 0x00007f3ce94070a9 in <unknown function>
 0x00007f3ce752c537 in <unknown> from /opt/conda/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 0x00007f3ce9923859 in abort + 0x12b from /lib/x86_64-linux-gnu/libc.so.6
 0x00007f3ce930bfac in _ZN9__gnu_cxx27__verbose_terminate_handlerEv + 0xc0 from /opt/conda/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f3ce930a56c in <unknown> from /opt/conda/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f3ce930a5be in <unknown> from /opt/conda/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f3ce930a7b3 in __cxa_rethrow + 0x0 from /opt/conda/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f3ce7324399 in <unknown> from /opt/conda/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 0x00007f3ce74bbee0 in _ZN5cling7runtime8internal25EvaluateDynamicExpressionEPNS_11InterpreterEPNS1_15DynamicExprInfoEPN5clang11DeclContextE + 0x190 from /opt/conda/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 0x00007f3ce94070a9 in <unknown function>
 0x00007f3ce752c537 in <unknown> from /opt/conda/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so

Note that you can mamba install gdb and then gdb python to get a debugger.

saraedum commented 2 years ago

I tried to bisect a bit with different versions of cppyy:

$ docker run -it condaforge/mambaforge:4.12.0-0
$$ mamba install -y cppyy'=VERSION' cxx-compiler c-compiler
# in older versions the include paths are not set correctly by conda-forge, so we need to deactivate and activate again
$$ conda deactivate
$$ conda activate
$$ python
>>> import cppyy
>>> cppyy.cppexec('a + 1')

I found that this works with 1.9.5 and fails with all the following versions 2.0.0+.
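With a longer version list, the search for the first failing release can be automated as a standard bisection. The sketch below is generic: `first_failing` is a hypothetical helper, and the `fails` predicate stands in for the manual docker/mamba steps above. It assumes the list is sorted oldest to newest and that once a version fails, every later one fails as well.

```python
def first_failing(versions, fails):
    """Return the first version for which fails(version) is true, or None.

    Assumes `versions` is sorted oldest to newest and that failure is
    monotonic: after the first failing version, all later ones fail too.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(versions[mid]):
            hi = mid          # the first failure is at mid or earlier
        else:
            lo = mid + 1      # the first failure is after mid
    return versions[lo] if lo < len(versions) else None


# Stand-in predicate mirroring the result reported above, using only the
# versions mentioned in this thread.
versions = ["1.9.5", "2.0.0", "2.2.0", "2.3.0"]
print(first_failing(versions, lambda v: v != "1.9.5"))
```

In a real run, `fails` would install the given version in a fresh container and check whether `cppyy.cppexec('a + 1')` aborts.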

The difference between these environments is:

< cppyy                     1.9.5            py39ha3ed2ce_0    conda-forge
< cppyy-backend             1.14.3           py39h1a9c180_0    conda-forge
< cppyy-cling               6.21.6           py39h0f9e12e_0    conda-forge
< cpycppyy                  1.12.4           py39h1a9c180_0    conda-forge
---
> cppyy                     2.0.0            py39ha3ed2ce_0    conda-forge
> cppyy-backend             1.14.5           py39h1a9c180_1    conda-forge
> cppyy-cling               6.25.0           py39h0f9e12e_0    conda-forge
> cpycppyy                  1.12.6           py39h1a9c180_1    conda-forge
> libllvm9                  9.0.1           default_hc23dcda_7    conda-forge

Note that since 2.0.0 we are also depending on conda-forge's libllvm9. Swapping out libllvm9 for a version with some cling patches does not change anything here.

Which llvm patches is cppyy-cling using when you build for PyPI?

wlav commented 2 years ago

> It's puzzling that you cannot reproduce this.

Not with the PyPI install that is. I haven't tried the conda install. Bit strapped for time: I want/need to show good progress with the Numba integration, so that has priority currently.

saraedum commented 2 years ago

No worries.

I had understood that you had no reproducer from that comment.

> Still remains that I can't reproduce this (neither PyPI nor conda version) ...

I am happy to look into this further on my own. Could you point me to the patches that you are applying to the llvm that comes with cppyy on PyPI?

wlav commented 2 years ago

Then I must have checked it after all ...

I don't think it's in LLVM/Clang, though (all patches are here: https://github.com/wlav/cppyy-backend/tree/master/cling/patches). At least not on Linux (changes on Windows are more invasive, in particular, yes, to support capturing exceptions (or rather RTTI in general)).

There's this recent bug report: https://github.com/wlav/cppyy/issues/59. Basically, the wrong (older) libstdc++.so can be loaded because of the way I'm using ctypes.CDLL('libstdc++.so', ctypes.RTLD_GLOBAL) in _stdcpp_fix.py.

One scenario that I think could be the cause is if there are 2 typeinfo objects of std::logic_error, which I guess could happen if multiple libstdc++.so are present. That would also be something more likely to happen with conda, which brings in its own.

This should be simple to test by running with the environment variable LD_DEBUG=files set. It will print where libstdc++.so is pulled in from.
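One way to script that check is sketched below. It assumes a glibc-based Linux system, where the dynamic loader honours LD_DEBUG and writes its trace to stderr; the helper name `loader_lines` is made up for this example. It returns the raw trace lines that mention the library and does not try to parse the loader's output format.

```python
import os
import subprocess
import sys


def loader_lines(snippet, needle="libstdc++"):
    """Run a Python snippet with LD_DEBUG=files and return the loader
    trace lines that mention `needle` (glibc dynamic loader only)."""
    env = dict(os.environ, LD_DEBUG="files")
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        env=env,
        capture_output=True,
        text=True,
        errors="replace",
    )
    # The loader trace is interleaved with the program's own stderr output.
    return [line.strip() for line in proc.stderr.splitlines() if needle in line]


if __name__ == "__main__":
    for line in loader_lines("import cppyy"):
        print(line)
```

If the returned lines name more than one distinct libstdc++.so path, that would support the duplicate-typeinfo scenario described above.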

saraedum commented 2 years ago

Ok. I'll have a look at these patches and see what we should be using for our llvm as well; there's only really these two (clang_printing.diff, explicit_templates.diff) and they don't seem to be related.

Thanks for the hint with the libstdc++.so. But that doesn't seem to be it. There's only a single libstdc++.so present in the docker reproducer above (the system has no libstdc++, there's only the conda one.)

saraedum commented 2 years ago

For what it's worth, this works with conda-forge's root:

>>> ROOT.TPython.Exec('a+1')
Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name 'a' is not defined

I am not sure if these are really related that much though.

wlav commented 2 years ago

TPython::Exec runs the Python interpreter, so no C++ exceptions involved in the above.

wlav commented 2 years ago

I tried the docker image above: building from source fixes it. I'll have a look to see whether I can figure out if it's specific to the LLVM used or to something related to the system.

saraedum commented 2 years ago

fwiw, our cling build does not seem to have that issue:

(base) root@a36dad40e7bd:/# cat crash.cc
a + 1;
(base) root@a36dad40e7bd:/# cling

****************** CLING ******************
* Type C++ code and press enter to run it *
*             Type .q to exit             *
*******************************************
[cling]$ a + 1
input_line_3:2:2: error: use of undeclared identifier 'a'
 a + 1
 ^
[cling]$ .x crash.cc
In file included from input_line_4:1:
/crash.cc:1:1: error: unknown type name 'a'
a + 1;
^
/crash.cc:1:3: error: expected unqualified-id
a + 1;
  ^
[cling]$

But maybe this always goes through the working code path that cppdef uses?

saraedum commented 2 years ago

I tried to rebuild cppyy-cling with -Og -g3 to understand where this exception is not being caught but unfortunately, I cannot get a good backtrace in gdb:

Catchpoint 1 (exception thrown), __cxxabiv1::__cxa_throw (obj=0x5635ca905910, tinfo=0x7fef07fbc918 <typeinfo for cling::CompilationException>, dest=0x7fef067e82a0 <cling::InterpreterException::~InterpreterException()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:80
80  ../../../../libstdc++-v3/libsupc++/eh_throw.cc: No such file or directory.
(gdb) where
#0  __cxxabiv1::__cxa_throw (obj=0x5635ca905910, tinfo=0x7fef07fbc918 <typeinfo for cling::CompilationException>,
    dest=0x7fef067e82a0 <cling::InterpreterException::~InterpreterException()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:80
#1  0x00007fef06653bed in cling::CompilationException::throwingHandler (reason=...)
    at /usr/local/src/conda/cppyy-cling-6.25.3/src/interpreter/cling/lib/Interpreter/Exception.cpp:114
#2  0x00007fef067efa42 in cling::runtime::internal::EvaluateDynamicExpression (interp=<optimized out>, DEI=<optimized out>, DC=<optimized out>)
    at /usr/local/src/conda/cppyy-cling-6.25.3/src/interpreter/cling/lib/Interpreter/Interpreter.cpp:1818
#3  0x00007fef087520a9 in ?? ()
#4  0x0000000000000000 in ?? ()

Could you tell me where this exception is supposed to be handled in the code? @wlav

wlav commented 2 years ago

Well, I tried every comparison I could think of (mismatched vtables, exception typeinfo symbols not exported/imported, RTTI or EH disabled, C++ library mismatches, etc., etc.) and compared it to my dev environment. There's nothing that stands out as either wrong or even different.

So instead, I tried to reproduce it in my dev environment, by trying to get the code to break in the same way, as maybe that would provide a clue.

The reality turns out that the pip installed version follows a decidedly different code path: it's not about it catching the exception, but rather that that code never throws.

This should have connected before, but I just didn't see it: the exception that terminates the process says Error evaluating expression (a + 1) but the error that I get from Cling when using the pip install is input_line_19:2:4: error: unexpected namespace name 'a': expected expression.

At this point, my best guess is a difference in initialization order, but so far the things I looked into (such as the setting of install_fatal_error_handler) didn't point to anything obvious.

Still digging ...

wlav commented 2 years ago

So not so much a different code path, just that in the docker image, that build will run EvaluateTSynthesizer::Initialize as part of EvaluateTSynthesizer::Transform after parsing failed, but the pip installed version does not. It's a transform that tries to figure out if a is some internal or conventional Cling symbol; it patches up the expression on success, and throws on failure.

But it only does that on decls that are annotated with the "__ResolveAtRuntime" string. That's not the case in the pip installed version, but I'm not seeing it in the docker install either, so still looking as to how it reaches that point.

wlav commented 2 years ago

Actually, there's another one that annotates decls with "__ResolveAtRuntime": CppyyLegacy::TClingCallbacks::tryResolveAtRuntimeInternal. (I was searching in the Cling sources only before, so I didn't see it earlier.) And that callback is indeed only called in the docker case, not in the pip install case.

Since I can't recompile the docker case (once done, it won't have the exception anymore), it's hard to pin down for certain where the problem comes from, but the call in the pip case is blocked by topmostDCIsFunction() returning false, meaning that the scope in which a is searched is not the top-most, or interactive/macro, scope. In the docker case, that call succeeds. Indeed, if I do:

cppyy.cppexec("void func() {a+1;}")

then that will produce a correct error message in both setups, as the exception-producing pass is never called.

Still leaves me puzzled why and how that difference comes about.

wlav commented 2 years ago

The above also suggests a workaround:

>>> import cppyy
>>> def __cling_workaround(line, call=cppyy.cppexec):
...     call("namespace __cling_workaround{"+line+";}")
... 
>>> cppyy.cppdef("namespace __cling_workaround{}; using namespace __cling_workaround;")
True
>>> cppyy.cppexec = __cling_workaround
>>> cppyy.cppexec("a+1")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in __cling_workaround
  File "/opt/conda/lib/python3.9/site-packages/cppyy/__init__.py", line 239, in cppexec
    raise SyntaxError('Failed to parse the given C++ code%s' % err.err)
SyntaxError: Failed to parse the given C++ code
input_line_20:1:30: error: unknown type name 'a'
namespace __cling_workaround{a+1;};
                             ^
input_line_20:1:31: error: expected unqualified-id
namespace __cling_workaround{a+1;};
                              ^
>>> cppyy.cppexec("int b=45")
>>> cppyy.gbl.b
45
>>> 

I'm going to ping upstream and see whether they know how this difference in top scope treatment can possibly come about.
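The wrapping step of the workaround can be looked at in isolation. The sketch below only builds the string that is handed to cppexec(); `wrap_for_cling` is a hypothetical helper name, not part of cppyy.

```python
def wrap_for_cling(line, namespace="__cling_workaround"):
    """Wrap a line in a namespace body, as the workaround above does."""
    return "namespace %s{%s;}" % (namespace, line)


# A declaration remains valid C++ once wrapped:
print(wrap_for_cling("int b=45"))
# An expression does not, because a namespace body may only hold declarations:
print(wrap_for_cling("a+1"))
```

This also shows a limitation to keep in mind: since a namespace body admits declarations only, the parser reads `a+1;` as the start of a declaration, hence the "unknown type name 'a'" diagnostic in the session above. An expression statement such as a function call would presumably be rejected the same way, so this form appears to suit declarations rather than arbitrary statements.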

saraedum commented 2 years ago

Thanks so much for looking into this :)

If you want me to provide a conda package with some patch applied for testing in the docker setup, please let me know.

wlav commented 2 years ago

In terms of trying a patch, the only thing I can think of that would determine whether there's some type of linker problem is to change this line: https://github.com/wlav/cppyy-backend/blob/master/cling/src/core/metacling/src/TCling.cxx#L2489 to catch (...). E.g. this:

diff --git a/cling/src/core/metacling/src/TCling.cxx b/cling/src/core/metacling/src/TCling.cxx
index 9efb618..6bc1e80 100644
--- a/cling/src/core/metacling/src/TCling.cxx
+++ b/cling/src/core/metacling/src/TCling.cxx
@@ -17,6 +17,8 @@ Cling is a full ANSI compliant C++-11 interpreter based on
 clang/LLVM technology.
 */

+#include <iostream>
+
 #include "TCling.h"

 #include "ROOT/FoundationUtils.hxx"
@@ -2492,6 +2494,14 @@ static int HandleInterpreterException(cling::MetaProcessor* metaProcessor,
       ex.diagnose();
       compRes = cling::Interpreter::kFailure;
    }
+   catch (std::exception& ex) {
+      std::cerr << "compilation failed: " << ex.what() << std::endl;
+      compRes = cling::Interpreter::kFailure;
+   }
+   catch (...) {
+      std::cerr << "compilation failed with unknown error" << std::endl;
+      compRes = cling::Interpreter::kFailure;
+   }
    return 0;
 }

wlav commented 2 years ago

I figured I might as well commit the patch above. In normal operation neither one of the two new catch blocks will ever execute, but it might help here.

No news from upstream, but they mentioned that it's possible to switch off dynamic lookups. However, when I do that, some test cases segfault somewhere deep inside clang, so I don't think that mode of operation is well tested enough.

wlav commented 2 years ago

I find that the last stage of the dynamic lookup, tryResolveAtRuntimeInternal, which marks the missing symbol as something to be created by the EvaluateTSynthesizer pass, has no use for cppyy, so I'm dropping it. (As I'm told, upstream has this for, e.g., automatically pulling objects from open files, but since that only works at the global scope, it makes little sense in Python either way.)

This also helps on M1, where EvaluateTSynthesizer runs JITed code, so the exception isn't caught (and my new workaround for exceptions on ARM does not, for some reason, preserve the exception type in this case, although it does otherwise); and on Windows, where no exception is ever thrown (the throw is macro'd out by the preprocessor) and an incorrect "success" is simply returned. So although I'd much rather have the exception-catching problem resolved, given that it's problematic on those two other platforms as well, not throwing is still the best option.

(I did look into passing out an error code from EvaluateT but that's far more involved, as the return type is already used and that's the only part passed out from running the JITed code in executeWrapper, so I'm punting on that.)

Fix is in repo. I'm leaving the extra catch(...) mentioned above in place, too.

saraedum commented 2 years ago

I applied the patch. It does not make a difference it seems:

>>> import cppyy
(Re-)building pre-compiled headers (options: -O2 -mavx); this may take a minute ...
>>> cppyy.cppexec('a + 1')
terminate called after throwing an instance of 'cling::CompilationException'
  what():  Error evaluating expression (a + 1)
 Generating stack trace...
 0x00007f96cdf33838 in raise + 0x18 from /usr/lib/libc.so.6
 0x00007f96cdf1d535 in abort + 0xcf from /usr/lib/libc.so.6
 0x00007f96cd61d036 in __gnu_cxx::__verbose_terminate_handler() at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1652324151713/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/vterminate.cc:95 from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f96cd61b524 in <unknown> from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f96cd61b576 in <unknown> from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f96cd61b768 in __cxa_rethrow at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1652324151713/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/eh_throw.cc:103 from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f96cb64853f in <unknown> from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 0x00007f96cb7dff70 in cling::runtime::internal::EvaluateDynamicExpression(cling::Interpreter*, cling::runtime::internal::DynamicExprInfo*, clang::DeclContext*) + 0x190 from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 0x00007f96ce20b0a9 in <unknown function>
 0x00007f96cb8505c7 in <unknown> from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 Generating stack trace...
 0x00007f96cdf33838 in raise + 0x18 from /usr/lib/libc.so.6
 0x00007f96cdf1d535 in abort + 0xcf from /usr/lib/libc.so.6
 0x00007f96cd61d036 in __gnu_cxx::__verbose_terminate_handler() at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1652324151713/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/vterminate.cc:95 from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f96cd61b524 in <unknown> from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f96cd61b576 in <unknown> from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f96cd61b768 in __cxa_rethrow at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1652324151713/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/eh_throw.cc:103 from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007f96cb64853f in <unknown> from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 0x00007f96cb7dff70 in cling::runtime::internal::EvaluateDynamicExpression(cling::Interpreter*, cling::runtime::internal::DynamicExprInfo*, clang::DeclContext*) + 0x190 from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 0x00007f96ce20b0a9 in <unknown function>
 0x00007f96cb8505c7 in <unknown> from /home/jule/proj/umamba/envs/flatsurf/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so

To clarify, that's the catch(...) patch. The one that we did not really expect to work anyway.

wlav commented 2 years ago

Yes; saw the same behavior on M1. The removal of tryResolveAtRuntimeInternal will work. :)

saraedum commented 2 years ago

Great. It's fixed in conda-forge now.

saraedum commented 2 years ago

While the above problem is fixed, cppexec still crashes in other instances for me:

import cppyy
cppyy.cppdef(r'''
void f(){ throw std::logic_error(":("); }
''')
cppyy.cppexec(r'''
f();
''')

crashes on the latest cppyy from conda-forge with

terminate called after throwing an instance of 'std::logic_error'
  what():  :(
 Generating stack trace...
/tmp/ruth/micromamba/envs/arbxx-build/bin/addr2line: DWARF error: section .debug_info is larger than its filesize! (0x9bd1aa vs 0x525748)
 0x00007fa7bc28f535 in abort + 0x121 from /lib/x86_64-linux-gnu/libc.so.6
 0x00007fa7bbaf2036 in __gnu_cxx::__verbose_terminate_handler() at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1652324151713/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/vterminate.cc:95 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007fa7bbaf0524 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007fa7bbaf0576 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007fa7bbaf0768 in __cxa_rethrow at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1652324151713/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/eh_throw.cc:103 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007fa7b1c8d097 in <unknown function>
 0x00007fa7b9498c67 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so
 Generating stack trace...
/tmp/ruth/micromamba/envs/arbxx-build/bin/addr2line: DWARF error: section .debug_info is larger than its filesize! (0x9bd1aa vs 0x525748)
 0x00007fa7bc28f535 in abort + 0x121 from /lib/x86_64-linux-gnu/libc.so.6
 0x00007fa7bbaf2036 in __gnu_cxx::__verbose_terminate_handler() at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1652324151713/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/vterminate.cc:95 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007fa7bbaf0524 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007fa7bbaf0576 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007fa7bbaf0768 in __cxa_rethrow at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1652324151713/work/build/x86_64-conda-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/eh_throw.cc:103 from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/lib-dynload/../../libstdc++.so
 0x00007fa7b1c8d097 in <unknown function>
 0x00007fa7b9498c67 in <unknown> from /tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/site-packages/cppyy_backend/lib/libCling.so

This was on Linux with these packages:

cppyy                     2.4.0            py39hd14de60_0    conda-forge
cppyy-backend             1.14.9           py39hf939315_0    conda-forge
cppyy-cling               6.27.0           py39h3d66fe8_0    conda-forge
cpycppyy                  1.12.11          py39hf939315_0    conda-forge

With the cppyy from PyPI I get:

Traceback (most recent call last):
  File "/dev/shm/ruth/arbxx-build/libarbxx/crash.py", line 5, in <module>
    cppyy.cppexec(r'''
  File "/tmp/ruth/micromamba/envs/arbxx-build/lib/python3.9/site-packages/cppyy/__init__.py", line 221, in cppexec
    raise SyntaxError('Failed to parse the given C++ code%s' % err.err)
SyntaxError: Failed to parse the given C++ code
compilation failed: :(

saraedum commented 2 years ago

When I call f() normally, there is no problem:

import cppyy
cppyy.cppdef(r'''
void f(){ throw std::logic_error(":("); }
''')
cppyy.gbl.f()

prints (as expected)

Traceback (most recent call last):
  File "/dev/shm/ruth/arbxx-build/libarbxx/crash.py", line 5, in <module>
    cppyy.gbl.f()
cppyy.gbl.std.logic_error: void ::f() =>
    logic_error: :(

so, again, only cppexec seems to be problematic somehow.
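
Since the call through cppyy.gbl reports the exception cleanly, one conceivable stopgap is to route statements through a generated function instead of cppexec. The following is only a sketch under that assumption, not something cppyy provides: the helper names `wrap_statements` and `exec_via_gbl` and the `_cppexec_wrapper_N` naming scheme are hypothetical, and only `cppyy.cppdef` and `cppyy.gbl` are taken from the thread.

```python
import itertools

# Counter used to give every generated wrapper a unique C++ name.
_counter = itertools.count()


def wrap_statements(body, name):
    """Return C++ source defining `void name()` with `body` as its body."""
    return "void %s() {\n%s\n}\n" % (name, body.strip())


def exec_via_gbl(body):
    """Compile `body` into a uniquely named function and call it via cppyy.gbl.

    Assumption: a C++ exception thrown inside the function is translated into
    a Python exception, as observed above for cppyy.gbl.f().
    """
    import cppyy  # imported lazily so wrap_statements() works without cppyy

    name = "_cppexec_wrapper_%d" % next(_counter)  # hypothetical naming scheme
    cppyy.cppdef(wrap_statements(body, name))
    return getattr(cppyy.gbl, name)()
```

With the `f` defined above, `exec_via_gbl("f();")` would be expected to raise on the Python side rather than abort, though I have not verified that on every platform. Unlike cppexec, any variables declared in `body` are local to the generated function and do not persist in the interpreter.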

wlav commented 2 years ago

Yes, figured: the problem in the original case was an exception thrown from JITed code (the dynamic lookup JITs a function which fails and throws), so this is the same thing.
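
As long as an exception escaping JITed code ends in std::terminate(), the abort cannot be caught in the calling Python process. A generic way to keep a long-running session alive is to run the risky snippet in a child interpreter and inspect how it ended. This is a minimal stdlib-only sketch, not part of cppyy; `run_isolated` is a hypothetical helper, and `os.abort()` stands in for a cppexec call that terminates.

```python
import signal
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run Python source `code` in a fresh interpreter and report how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # On POSIX, a negative return code means the child was killed by that signal.
    aborted = proc.returncode == -signal.SIGABRT
    return proc.returncode, aborted, proc.stderr


if __name__ == "__main__":
    # Stand-in for a cppexec call that ends in std::terminate():
    # os.abort() raises SIGABRT in the child, the parent keeps running.
    rc, aborted, err = run_isolated("import os; os.abort()")
    print("child aborted:", aborted)
```

The signal check assumes a POSIX system; on Windows the abort shows up as a non-zero exit code instead. State built up in the child (loaded headers, defined functions) is lost when it exits, so this only suits self-contained snippets.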

saraedum commented 2 years ago

There has not been a cppyy release in some time. Should the latest problem already be fixed in the repo?

wlav commented 2 years ago

I haven't worked on this recently...

As for a release, I'm sitting on a pull request for better Numba support. Total lack of time over the past couple of months b/c of conference, COVID, and the start of new projects. I'm starting to think I should just push a 2.4.2 release instead of 2.5.0 for Numba.

saraedum commented 1 year ago

No worries. Is there anything I can do to get a fix for this into the next release? I wouldn't mind trying to investigate what's going on myself, but I am mostly lost in the cppyy/cling source code, so I would need some pointers on where to start.

saraedum commented 1 year ago

A possible workaround is to set set_signals_as_exception:

In [1]: import cppyy.ll

In [2]: cppyy.ll.set_signals_as_exception(True)
Out[2]: False

In [3]: import cppyy
   ...: cppyy.cppdef(r'''
   ...: void f(){ throw std::logic_error(":("); }
   ...: ''')
   ...: cppyy.cppexec(r'''
   ...: f();
   ...: ''')
terminate called after throwing an instance of 'std::logic_error'
  what():  :(
long CppyyLegacy::TInterpreter::ProcessLine(const char* line, int* error = 0) =>
    AbortSignal: abort from C++; program state was reset

Traceback (most recent call last):

  File ~/proj/umamba/envs/flatsurf/lib/python3.10/site-packages/IPython/core/interactiveshell.py:3433 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  Cell In[3], line 5
    cppyy.cppexec(r'''

  File ~/proj/umamba/envs/flatsurf/lib/python3.10/site-packages/cppyy/__init__.py:249 in cppexec
    raise SyntaxError('Failed to parse the given C++ code%s' % err.err)

  File <string>
SyntaxError: Failed to parse the given C++ code
 *** Break *** abort

wlav commented 1 year ago

The latter is precisely what I have as a workaround implementation. :) It may be that I'm missing another location where Cling is doing something similar: JITing something to create an on-the-fly lookup. It remains that for general use, there shouldn't be any (and that exceptions should work for JITed code, or at least on Linux/Intel).

My current hope is that LLVM13, which is almost ready, will solve the problem, but I haven't tested any of it yet.

saraedum commented 2 months ago

Seems to be fixed in 3.1.2 (it may also be fixed in earlier versions, but that's the one I tested with).