MikePopoloski / slang

SystemVerilog compiler and language services
MIT License
554 stars 120 forks source link

crash in python bindings #758

Closed tdp2110 closed 1 year ago

tdp2110 commented 1 year ago

Describe the bug

Apologies for the lack of repro'ing example (I've been trying, will try some more as I have time), but I've been crashing pyslang in pure python lately. Sometimes I get a segfault, sometimes the OS detects an invalid free or memory corruption. Running in valgrind, I see the following suspicious warning, which might be enough for you to understand the bug (I don't know enough about slang's memory management to offer a fix)

==22646== Invalid free() / delete / delete[] / realloc()
==22646==    at 0x4C2B6DF: operator delete(void*, unsigned long) (vg_replace_malloc.c:595)
==22646==    by 0xD7D8927: std::default_delete<slang::ast::EnumValueSymbol>::operator()(slang::ast::EnumValueSymbol*) const (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD7A4A1D: std::unique_ptr<slang::ast::EnumValueSymbol, std::default_delete<slang::ast::EnumValueSymbol> >::~unique_ptr() (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD77F2B1: pybind11::class_<slang::ast::EnumValueSymbol, slang::ast::ValueSymbol>::dealloc(pybind11::detail::value_and_holder&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD4C5186: pybind11::detail::clear_instance(_object*) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD4C5294: pybind11_object_dealloc (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0x33B87B: subtype_dealloc (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x2E0CC7: dict_dealloc (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x2E575A: meth_dealloc (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x33B82C: subtype_dealloc (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x2E0B96: dict_dealloc (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x363A71: subtype_clear (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x2D68E6: gc_collect_main (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3DCF5B: _PyGC_CollectNoFail.isra.0 (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3DC4B4: finalize_modules (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3CA769: Py_FinalizeEx (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3D6954: Py_RunMain (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x399178: Py_BytesMain (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x5986554: (below main) (in /usr/lib64/libc-2.17.so)
==22646==  Address 0xcf7ae90 is 3,280 bytes inside a block of size 4,096 alloc'd
==22646==    at 0x4C2A593: operator new(unsigned long) (vg_replace_malloc.c:344)
==22646==    by 0xDEB3515: slang::BumpAllocator::allocSegment(slang::BumpAllocator::Segment*, unsigned long) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xDEB34BD: slang::BumpAllocator::allocateSlow(unsigned long, unsigned long) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD4D078B: slang::BumpAllocator::allocate(unsigned long, unsigned long) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE008812: slang::ast::FieldSymbol* slang::BumpAllocator::emplace<slang::ast::FieldSymbol, std::basic_string_view<char, std::char_traits<char> >, slang::SourceLocation, unsigned int, unsigned int>(std::basic_string_view<char, std::char_traits<char> >&&, slang::SourceLocation&&, unsigned int&&, unsigned int&&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE0042D2: slang::ast::PackedStructType::fromSyntax(slang::ast::Compilation&, slang::syntax::StructUnionTypeSyntax const&, slang::ast::ASTContext const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE021838: slang::ast::Type::fromSyntax(slang::ast::Compilation&, slang::syntax::DataTypeSyntax const&, slang::ast::ASTContext const&, slang::ast::Type const*) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE0589FD: slang::ast::Compilation::getType(slang::syntax::DataTypeSyntax const&, slang::ast::ASTContext const&, slang::ast::Type const*) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE00D78C: slang::ast::DeclaredType::resolveType(slang::ast::ASTContext const&, slang::ast::ASTContext const&) const (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE00CF21: slang::ast::DeclaredType::getType() const (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE0851CF: bool slang::ast::DiagnosticVisitor::handleDefault<slang::ast::TypeAliasType>(slang::ast::TypeAliasType const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE077A72: void slang::ast::DiagnosticVisitor::handle<slang::ast::TypeAliasType>(slang::ast::TypeAliasType const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE06CEB2: void slang::ast::ASTVisitor<slang::ast::DiagnosticVisitor, false, false>::visit<slang::ast::TypeAliasType>(slang::ast::TypeAliasType const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE0630CB: decltype(auto) slang::ast::Symbol::visit<slang::ast::DiagnosticVisitor&>(slang::ast::DiagnosticVisitor&) const (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE098A76: void slang::ast::ASTVisitor<slang::ast::DiagnosticVisitor, false, false>::visitDefault<slang::ast::PackageSymbol>(slang::ast::PackageSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE085FA1: bool slang::ast::DiagnosticVisitor::handleDefault<slang::ast::PackageSymbol>(slang::ast::PackageSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE077C14: void slang::ast::DiagnosticVisitor::handle<slang::ast::PackageSymbol>(slang::ast::PackageSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE06D0EC: void slang::ast::ASTVisitor<slang::ast::DiagnosticVisitor, false, false>::visit<slang::ast::PackageSymbol>(slang::ast::PackageSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE06324B: decltype(auto) slang::ast::Symbol::visit<slang::ast::DiagnosticVisitor&>(slang::ast::DiagnosticVisitor&) const (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE0988C6: void slang::ast::ASTVisitor<slang::ast::DiagnosticVisitor, false, false>::visitDefault<slang::ast::CompilationUnitSymbol>(slang::ast::CompilationUnitSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE0854A7: bool slang::ast::DiagnosticVisitor::handleDefault<slang::ast::CompilationUnitSymbol>(slang::ast::CompilationUnitSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE077ABE: void slang::ast::DiagnosticVisitor::handle<slang::ast::CompilationUnitSymbol>(slang::ast::CompilationUnitSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE06CEFE: void slang::ast::ASTVisitor<slang::ast::DiagnosticVisitor, false, false>::visit<slang::ast::CompilationUnitSymbol>(slang::ast::CompilationUnitSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE0630FB: decltype(auto) slang::ast::Symbol::visit<slang::ast::DiagnosticVisitor&>(slang::ast::DiagnosticVisitor&) const (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE098826: void slang::ast::ASTVisitor<slang::ast::DiagnosticVisitor, false, false>::visitDefault<slang::ast::RootSymbol>(slang::ast::RootSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE085391: bool slang::ast::DiagnosticVisitor::handleDefault<slang::ast::RootSymbol>(slang::ast::RootSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE077A98: void slang::ast::DiagnosticVisitor::handle<slang::ast::RootSymbol>(slang::ast::RootSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE06CED8: void slang::ast::ASTVisitor<slang::ast::DiagnosticVisitor, false, false>::visit<slang::ast::RootSymbol>(slang::ast::RootSymbol const&) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE0630E3: decltype(auto) slang::ast::Symbol::visit<slang::ast::DiagnosticVisitor&>(slang::ast::DiagnosticVisitor&) const (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE056D28: slang::ast::Compilation::getSemanticDiagnostics() (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xE05835A: slang::ast::Compilation::getAllDiagnostics() (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD6867A0: pybind11::cpp_function::cpp_function<slang::Diagnostics const&, slang::ast::Compilation, , pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(slang::Diagnostics const& (slang::ast::Compilation::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(slang::ast::Compilation*)#1}::operator()(slang::ast::Compilation*) const (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD6BF3D8: slang::Diagnostics const& pybind11::detail::argument_loader<slang::ast::Compilation*>::call_impl<slang::Diagnostics const&, pybind11::cpp_function::cpp_function<slang::Diagnostics const&, slang::ast::Compilation, , pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(slang::Diagnostics const& (slang::ast::Compilation::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(slang::ast::Compilation*)#1}&, 0ul, pybind11::detail::void_type>(pybind11::cpp_function::cpp_function<slang::Diagnostics const&, slang::ast::Compilation, , pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(slang::Diagnostics const& (slang::ast::Compilation::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(slang::ast::Compilation*)#1}&, std::integer_sequence<unsigned long, 0ul>, pybind11::detail::void_type&&) && (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD6B0414: std::enable_if<!std::is_void<slang::Diagnostics const&>::value, slang::Diagnostics const&>::type pybind11::detail::argument_loader<slang::ast::Compilation*>::call<slang::Diagnostics const&, pybind11::detail::void_type, pybind11::cpp_function::cpp_function<slang::Diagnostics const&, slang::ast::Compilation, , pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(slang::Diagnostics const& (slang::ast::Compilation::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(slang::ast::Compilation*)#1}&>(pybind11::cpp_function::cpp_function<slang::Diagnostics const&, slang::ast::Compilation, , pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(slang::Diagnostics const& (slang::ast::Compilation::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(slang::ast::Compilation*)#1}&) && (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD6995C7: void pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<slang::Diagnostics const&, slang::ast::Compilation, , pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(slang::Diagnostics const& (slang::ast::Compilation::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(slang::ast::Compilation*)#1}, slang::Diagnostics const&, slang::ast::Compilation*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(pybind11::cpp_function::initialize<slang::Diagnostics const&, slang::ast::Compilation, , pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(slang::Diagnostics const& (slang::ast::Compilation::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(slang::ast::Compilation*)#1}&&, slang::Diagnostics const& (*)(slang::ast::Compilation*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call) const (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD69961A: void pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<slang::Diagnostics const&, slang::ast::Compilation, , pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(slang::Diagnostics const& (slang::ast::Compilation::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(slang::ast::Compilation*)#1}, slang::Diagnostics const&, slang::ast::Compilation*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(pybind11::cpp_function::initialize<slang::Diagnostics const&, slang::ast::Compilation, , pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::return_value_policy>(slang::Diagnostics const& (slang::ast::Compilation::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(slang::ast::Compilation*)#1}&&, slang::Diagnostics const& (*)(slang::ast::Compilation*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0xD4CA041: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /home/tpeters/projects/pyslang/slang/build/lib64/pyslang.cpython-311-x86_64-linux-gnu.so)
==22646==    by 0x3053C6: cfunction_call (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x2E1D33: _PyObject_MakeTpCall (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x2EE385: _PyEval_EvalFrameDefault (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3AB42D: _PyEval_Vector (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3AAA8E: PyEval_EvalCode (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3CBE0B: run_eval_code_obj (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3C8323: run_mod (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3DD321: pyrun_file (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3DCC54: _PyRun_SimpleFileObject (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3DCA22: _PyRun_AnyFileObject (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x3D6AB5: Py_RunMain (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x399178: Py_BytesMain (in /home/tpeters/miniconda3/envs/test/bin/python3.11)
==22646==    by 0x5986554: (below main) (in /usr/lib64/libc-2.17.so)

Looks like the diagnostics engine allocates a symbol from inside a block, then python tries to deallocate it using free from the interior of the block (an invalid free). I'm surprised I've had such trouble creating an example based on this.

To Reproduce

Unfortunately I don't have a small repro I can share, but the code roughly takes the following form:

import gc
import pyslang
import sys

class Foo:
    def __init__(self):
        self.asttop = None

    def process_files(self, files):
        comp = pyslang.Compilation()
        for fname in files:
            cst = pyslang.SyntaxTree.fromFile(fname)
            if not cst:
                raise RuntimeError(
                    f"pyslang could not compute syntax tree fromfile {fname}"
                )
            comp.addSyntaxTree(cst)
        astroot = comp.getRoot()
        self.asttop = astroot.topInstances
        diags = comp.getAllDiagnostics()
        errors = [d for d in diags if d.isError()]
        report = pyslang.DiagnosticEngine.reportAll(comp.sourceManager, diags)
        if report:
            print(report, file=sys.stderr)

def main():
    foo = Foo()
    foo.process_files(sys.argv[1:])

    # some analysis of the foo.asttop, putting pieces into data structures

if __name__ == '__main__':
    main()

Additional context

I "fixed" my problem by hacking pyslang to just leak everything 🤓

tdp2110 commented 1 year ago

https://pybind11.readthedocs.io/en/stable/advanced/functions.html#return-value-policies might be relevant here.

MikePopoloski commented 1 year ago

This is running with a local build of slang? So a fairly recent revision? I wonder if you're building now with mimalloc and that's incompatible with the way Python handles allocations?

tdp2110 commented 1 year ago

This is running with a local build of slang? So a fairly recent revision? I wonder if you're building now with mimalloc and that's incompatible with the way Python handles allocations?

Local build indeed, head at d27d9ac (Fix noexcept build, Mon May 15 20:54:46 2023 -0400). I tried with both mimallc and without (and lots of other build configs, which led to https://github.com/MikePopoloski/slang/pull/756) and I believe I was crashing them all.

MikePopoloski commented 1 year ago

Hmm, ok. It's certainly possible this is a bug in the binding code, the return value policies (which you linked to) are easy to get wrong. Without a more accurate reproducer though it's going to be tough to guess where it is; the call stack you listed pointed to EnumValueSymbols so I can look at methods that return those but otherwise I won't have much to go on.

MikePopoloski commented 1 year ago

I did an audit and fixed a few return value policies that seemed wrong in af7827decd2be7c8256513fb6d3703ffb9b8adb9 but I don't know if it will help your specific problem.

tdp2110 commented 1 year ago

Awesome, thanks! I’ll try on Tuesday. Have a great weekend Sent from my iPhoneOn May 20, 2023, at 10:05, Michael Popoloski @.***> wrote: I did an audit and fixed a few return value policies that seemed wrong in af7827d but I don't know if it will help your specific problem.

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: @.***>

tdp2110 commented 1 year ago

https://github.com/MikePopoloski/slang/commit/af7827decd2be7c8256513fb6d3703ffb9b8adb9 removes the crash and the memcheck warning!

Thanks again, I really appreciate your effort!