Open · inducer opened this issue 6 years ago
If you're logged in remotely, make sure your user account is in the "video" group. Otherwise you don't have access to the required graphics driver device nodes. This is currently handled poorly in the Thunk.
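A quick way to check is to see whether your user can actually open those nodes. Here is a minimal sketch (not from the ROCm sources; the paths are the usual defaults and are assumptions, the render node number can differ per system):

```cpp
// check_nodes.cpp -- minimal sketch: verify that the current user can open
// the device nodes the ROCm runtime needs. Paths are assumed defaults.
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <iostream>
#include <unistd.h>

int main() {
    const char* nodes[] = {"/dev/kfd", "/dev/dri/renderD128"};
    for (const char* node : nodes) {
        int fd = open(node, O_RDWR);
        if (fd < 0) {
            std::cout << node << ": cannot open (" << std::strerror(errno) << ")\n";
        } else {
            std::cout << node << ": OK\n";
            close(fd);
        }
    }
    return 0;
}
```

If either node fails with EACCES, adding the user to the "video" group (and re-logging in) is the usual fix.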
Thanks for the heads-up. FWIW, my situation is that I run a small cluster of scientific-computing research machines, and this issue crashes unrelated MPI code from users who don't use OpenCL directly (because Open MPI uses hwloc, and hwloc appears to enumerate OpenCL devices). I would like to offer AMD compute as a capability, but this makes it kind of hard.
I'm not sure if it's the same bug, but I get a SIGSEGV in clGetPlatformIDs.
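For what it's worth, a minimal sketch like the following (plain OpenCL headers, nothing ROCm-specific assumed; the build line is a guess for a typical setup) is enough to hit it here, since the very first ICD call is what dies:

```cpp
// repro.cpp -- minimal sketch that only calls clGetPlatformIDs.
// Assumed build line, adjust to your setup:
//   g++ repro.cpp -lOpenCL -o repro
#include <CL/cl.h>
#include <iostream>

int main() {
    cl_uint num_platforms = 0;
    // On the affected setup this call segfaults inside the ROCm runtime
    // before it ever returns.
    cl_int err = clGetPlatformIDs(0, nullptr, &num_platforms);
    if (err != CL_SUCCESS) {
        std::cerr << "clGetPlatformIDs failed: " << err << "\n";
        return 1;
    }
    std::cout << "found " << num_platforms << " platform(s)\n";
    return 0;
}
```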
@fxkamd I straced the crashing program. It successfully opens /dev/kfd and /dev/dri/renderD128, and no syscall fails with EACCES or EPERM. I conclude that this is unrelated to groups and permissions.
Stack trace:
0x0000000000000000
__gthread_create
std::thread::_M_start_thread
std::thread::thread<(anonymous namespace)::ThreadPoolExecutor::ThreadPoolExecutor(unsigned int)::<lambda()> >
(anonymous namespace)::ThreadPoolExecutor::ThreadPoolExecutor
(anonymous namespace)::Executor::getDefaultExecutor
llvm::parallel::detail::TaskGroup::spawn(std::function<void ()>)
llvm::parallel::detail::parallel_for_each<__gnu_cxx::__normal_iterator<lld::elf::InputSectionBase**, std::vector<lld::elf::InputSectionBase*, std::allocator<lld::elf::InputSectionBase*> > >, lld::elf::splitSections<llvm::object::ELFType<(llvm::support::endianness)1, true> >()::{lambda(lld::elf::InputSectionBase*)#1}>(__gnu_cxx::__normal_iterator<lld::elf::InputSectionBase**, std::vector<lld::elf::InputSectionBase*, std::allocator<lld::elf::InputSectionBase*> > >, lld::elf::splitSections<llvm::object::ELFType<(llvm::support::endianness)1, true> >()::{lambda(lld::elf::InputSectionBase*)#1}, lld::elf::splitSections<llvm::object::ELFType<(llvm::support::endianness)1, true> >()::{lambda(lld::elf::InputSectionBase*)#1})
llvm::parallel::for_each<__gnu_cxx::__normal_iterator<lld::elf::InputSectionBase**, std::vector<lld::elf::InputSectionBase*, std::allocator<lld::elf::InputSectionBase*> > >, lld::elf::splitSections<llvm::object::ELFType<(llvm::support::endianness)1, true> >()::{lambda(lld::elf::InputSectionBase*)#1}>(llvm::parallel::parallel_execution_policy, __gnu_cxx::__normal_iterator<lld::elf::InputSectionBase**, std::vector<lld::elf::InputSectionBase*, std::allocator<lld::elf::InputSectionBase*> > >, llvm::parallel::parallel_execution_policy, lld::elf::splitSections<llvm::object::ELFType<(llvm::support::endianness)1, true> >()::{lambda(lld::elf::InputSectionBase*)#1})
lld::parallelForEach<std::vector<lld::elf::InputSectionBase*, std::allocator<lld::elf::InputSectionBase*> >&, lld::elf::splitSections<llvm::object::ELFType<(llvm::support::endianness)1, true> >()::{lambda(lld::elf::InputSectionBase*)#1}>(std::vector<lld::elf::InputSectionBase*, std::allocator<lld::elf::InputSectionBase*> >&, lld::elf::splitSections<llvm::object::ELFType<(llvm::support::endianness)1, true> >()::{lambda(lld::elf::InputSectionBase*)#1})
lld::elf::splitSections<llvm::object::ELFType<(llvm::support::endianness)1, true> >
lld::elf::LinkerDriver::link<llvm::object::ELFType<(llvm::support::endianness)1, true> >
lld::elf::LinkerDriver::main
lld::elf::link
amd::opencl_driver::AMDGPUCompiler::CompileAndLinkExecutable
amd::opencl_driver::AMDGPUCompiler::CompileAndLinkExecutable
amd::CacheCompilation::compileAndLinkExecutable
device::Program::linkImplLC
device::Program::build
amd::Program::build
amd::Device::BlitProgram::create
roc::Device::create
roc::Device::init
amd::Device::init
amd::Runtime::init
clIcdGetPlatformIDsKHR
??
clGetPlatformIDs
It looks like the linker calls through a null pointer while spawning a thread: the top frame is 0x0, called from __gthread_create.
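To illustrate why the 0x0000000000000000 frame reads that way (this is only an illustration, not the actual lld/ROCm code path): a call through a null function pointer jumps to address 0, so the faulting frame shows up as all zeros with the caller as the next frame, exactly like __gthread_create in the trace above.

```cpp
// null_call.cpp -- minimal illustration, not the actual lld code.
// Calling a null function pointer is undefined behaviour; compiled without
// optimization it jumps to address 0 and the backtrace shows a 0x0 top frame
// with main (here) / __gthread_create (in the trace) as the caller.
int main() {
    void (*fn)() = nullptr;  // stand-in for whatever pointer ends up null
    fn();                    // SIGSEGV at address 0
    return 0;
}
```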
ROCm 2.0.0 works on the same system, so bisecting may help, although bisecting LLVM is going to be painfully slow.
With the following backtrace: