daphne-eu / daphne

DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines
Apache License 2.0

seg fault on 'clone' sys call when using vectorized engine #884

Closed m-birke closed 3 weeks ago

m-birke commented 3 weeks ago

I ran an internal script with --vec:

TBD

After it failed, I started it under gdb and saw it running. The next try under gdb gave this backtrace:

#0 0x00000000016bb328 in (anonymous namespace)::VectorizeComputationsPass::runOnOperation() ()
#1 0x0000000001ed02c5 in mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) ()
#2 0x0000000001ed0917 in mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) ()
#3 0x0000000001ed59cc in std::_Function_handler<void (), mlir::failableParallelForEach<__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_0&>(mlir::MLIRContext*, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_0&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#4 0x0000000001de6ae2 in std::_Function_handler<void (), llvm::ThreadPool::createTaskAndFuture(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#5 0x0000000003dd5f57 in llvm::ThreadPool::processTasks(llvm::ThreadPoolTaskGroup*) ()
#6 0x0000000003dd6c43 in void* llvm::thread::ThreadProxy<std::tuple<llvm::ThreadPool::grow(int)::$_0> >(void*) ()
#7 0x00007ffff7ab61ca in start_thread () from /lib64/libpthread.so.0
#8 0x00007ffff49c3e73 in clone () from /lib64/libc.so.6
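
Since the crash did not show up on every run, a small helper that repeats the command and counts segfaults may help confirm whether a change actually removes it. This is only a sketch: the function name count_segfaults is made up for illustration, and the daphne binary path and script are whatever you pass on the command line (only the --vec flag comes from this report). It relies on the POSIX convention that Python reports a child killed by signal N as return code -N.

```python
import signal
import subprocess
import sys


def count_segfaults(cmd, runs=10):
    """Run cmd `runs` times and return how many runs were killed by SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, returncode == -N means the child was terminated by signal N.
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage (binary and script paths are placeholders):
    #   python repro.py <path-to-daphne> --vec <script>
    print(count_segfaults(sys.argv[1:]))
```

Running it before and after applying a candidate fix gives a rough crash count to compare. A count of zero over a handful of runs does not prove the problem is gone.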

philipportner commented 3 weeks ago

I strongly suspect this is a duplicate of #881. You could try running it again with the changes from #882. #882 is not merged yet because so far I have only fixed the crash, not the root cause.

philipportner commented 3 weeks ago

closed by #882