LLVM 11.0 & Pytorch HIP: Cannot select: t14: v2f64 = extract_subvector t11, Constant:i32<2>

Quuxplusone commented 4 years ago


Bugzilla Link	PR46067
Status	NEW
Importance	P normal
Reported by	Jonathan Schrack (jmschrack@gmail.com)
Reported on	2020-05-25 12:08:42 -0700
Last modified on	2020-05-26 16:55:31 -0700
Version	trunk
Hardware	PC Linux
CC	efriedma@quicinc.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, neeilans@live.com, richard-llvm@metafoo.co.uk
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

I built LLVM from the GitHub source and am trying to build a HIP-ified PyTorch from source. The build fails with:

[3878/4644] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_generate_proposals_op_util_nms_gpu.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_generate_proposals_op_util_nms_gpu.hip.o
cd /home/skullcrab/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip && /usr/bin/cmake -E make_directory /home/skullcrab/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/home/skullcrab/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_generate_proposals_op_util_nms_gpu.hip.o -P /home/skullcrab/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_generate_proposals_op_util_nms_gpu.hip.o.cmake
LLVM ERROR: Cannot select: t14: v2f64 = extract_subvector t11, Constant:i32<2>
  t11: v4f64,ch = CopyFromReg t0, Register:v4f64 %365
    t10: v4f64 = Register %365
  t12: i32 = Constant<2>
In function:
_ZN6caffe25utils12_GLOBAL__N_116RotatedNMSKernelEPKNS0_10RotatedBoxEifiPi
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.  Program arguments: /opt/rocm-3.3.0/llvm/bin/llc /tmp/generate_proposals_op_util_nms_gpu-45ebbd-gfx803-optimized-66393e.bc -O3 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx803 -filetype=obj -amdgpu-early-inline-all=true -amdgpu-function-calls=false -o /tmp/generate_proposals_op_util_nms_gpu-45ebbd-gfx803-0c705d.o
1.  Running pass 'CallGraph Pass Manager' on module '/tmp/generate_proposals_op_util_nms_gpu-45ebbd-gfx803-optimized-66393e.bc'.
2.  Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@_ZN6caffe25utils12_GLOBAL__N_116RotatedNMSKernelEPKNS0_10RotatedBoxEifiPi'
 #0 0x0000000001a92a44 PrintStackTraceSignalHandler(void*) (/opt/rocm-3.3.0/llvm/bin/llc+0x1a92a44)
 #1 0x0000000001a9052e llvm::sys::RunSignalHandlers() (/opt/rocm-3.3.0/llvm/bin/llc+0x1a9052e)
 #2 0x0000000001a92d6c SignalHandler(int) (/opt/rocm-3.3.0/llvm/bin/llc+0x1a92d6c)
 #3 0x00007fe28d07c890 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12890)
 #4 0x00007fe28bd42e97 raise /build/glibc-OTsEL5/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
 #5 0x00007fe28bd44801 abort /build/glibc-OTsEL5/glibc-2.27/stdlib/abort.c:81:0
 #6 0x0000000001a211e0 llvm::report_fatal_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) (/opt/rocm-3.3.0/llvm/bin/llc+0x1a211e0)
 #7 0x0000000001a211f7 (/opt/rocm-3.3.0/llvm/bin/llc+0x1a211f7)
 #8 0x00000000018f79a2 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/opt/rocm-3.3.0/llvm/bin/llc+0x18f79a2)
 #9 0x00000000018f6bee llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) (/opt/rocm-3.3.0/llvm/bin/llc+0x18f6bee)
#10 0x0000000000876b6a (anonymous namespace)::AMDGPUDAGToDAGISel::Select(llvm::SDNode*) (/opt/rocm-3.3.0/llvm/bin/llc+0x876b6a)
#11 0x00000000018ed73e llvm::SelectionDAGISel::DoInstructionSelection() (/opt/rocm-3.3.0/llvm/bin/llc+0x18ed73e)
#12 0x00000000018ec5da llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/opt/rocm-3.3.0/llvm/bin/llc+0x18ec5da)
#13 0x00000000018ea1d5 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/opt/rocm-3.3.0/llvm/bin/llc+0x18ea1d5)
#14 0x00000000018e6666 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/opt/rocm-3.3.0/llvm/bin/llc+0x18e6666)
#15 0x0000000000875c54 (anonymous namespace)::AMDGPUDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/opt/rocm-3.3.0/llvm/bin/llc+0x875c54)
#16 0x0000000000fe24ba llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/opt/rocm-3.3.0/llvm/bin/llc+0xfe24ba)
#17 0x00000000013a43de llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/rocm-3.3.0/llvm/bin/llc+0x13a43de)
#18 0x0000000000c5c22c (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) (/opt/rocm-3.3.0/llvm/bin/llc+0xc5c22c)
#19 0x00000000013a4daf llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/rocm-3.3.0/llvm/bin/llc+0x13a4daf)
#20 0x000000000067e527 main (/opt/rocm-3.3.0/llvm/bin/llc+0x67e527)
#21 0x00007fe28bd25b97 __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:344:0
#22 0x000000000067c05a _start (/opt/rocm-3.3.0/llvm/bin/llc+0x67c05a)
clang-11: error: unable to execute command: Aborted (core dumped)
clang-11: error: amdgcn-link command failed due to signal (use -v to see invocation)
clang version 11.0.0 (https://github.com/llvm/llvm-project.git efa70843aa711802d57d1600d705a5bb51b4c740)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-3.3.0/llvm/bin
clang-11: note: diagnostic msg: Error generating preprocessed source(s).
CMake Error at torch_hip_generated_generate_proposals_op_util_nms_gpu.hip.o.cmake:174 (message):
  Error generating file
  /home/skullcrab/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_generate_proposals_op_util_nms_gpu.hip.o

Quuxplusone commented 4 years ago

Missing steps to reproduce.

Quuxplusone commented 4 years ago

Jonathan - to reproduce the bug, at the very least we need the preprocessed file with its full command line, or preferably the IR file (which can be generated by adding -emit-llvm to the clang command line).
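
As a rough sketch of a standalone reproducer: the "Program arguments" line in the stack dump already gives the failing `llc` invocation, so it can be re-run by hand once the bitcode has been kept (the `/tmp` files are normally deleted after the crash; clang's `-save-temps` option is one way to keep such intermediates). The `BC` and `OUT` paths and the `llc_repro_cmd` helper below are hypothetical names for illustration; the flags are copied from the report. The script only prints the command, it does not run it.

```shell
#!/bin/sh
# Sketch: rebuild the failing llc command line from the stack dump.
# LLC is the path shown in the report; BC and OUT are HYPOTHETICAL paths
# that must be replaced with a saved copy of the bitcode and a scratch output.
LLC="${LLC:-/opt/rocm-3.3.0/llvm/bin/llc}"
BC="${BC:-/tmp/nms_gpu-gfx803.bc}"
OUT="${OUT:-/tmp/nms_gpu-gfx803.o}"

# Print the llc command for a given GPU target (defaults to gfx803,
# the target that fails in the report).
llc_repro_cmd() {
    printf '%s %s -O3 -mtriple=amdgcn-amd-amdhsa -mcpu=%s -filetype=obj -amdgpu-early-inline-all=true -amdgpu-function-calls=false -o %s\n' \
        "$LLC" "$BC" "${1:-gfx803}" "$OUT"
}

# Show the failing configuration and, for comparison, the gfx900 one.
llc_repro_cmd gfx803
llc_repro_cmd gfx900
```

Piping one of the printed lines to `sh` runs it; if the gfx803 line aborts with the same "Cannot select" message while the gfx900 line succeeds, the saved `.bc` file is the input worth attaching.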

Quuxplusone commented 4 years ago

Simon,
I worked around the issue by changing all the CMake instructions to build only for gfx900 (my current card), since I noticed the failure happens on the gfx803 target. I'll revert, rebuild, and try to get the input files, commands, and IR tomorrow.
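
A minimal sketch of how one might locate the places to change for such a workaround, assuming the target names appear literally in the project's CMake files (which files were actually edited is not stated here). The helper name `find_gpu_target_refs` is hypothetical, and `--include` assumes GNU or BSD grep.

```shell
#!/bin/sh
# Sketch: list CMake files under a source tree that mention a GPU target.
#   $1 = source tree to search (for example the pytorch checkout)
#   $2 = target name to look for (for example gfx803)
# Prints matches as file:line:text; returns non-zero when nothing matches.
find_gpu_target_refs() {
    grep -rn --include='CMakeLists.txt' --include='*.cmake' -- "$2" "$1"
}

# Example (not executed here):
#   find_gpu_target_refs ~/pytorch gfx803
```

Each hit is a candidate line where the target list could be narrowed to gfx900.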

Eli,
Sorry about that. I posted the repro steps in the wrong bugzilla. :)

Steps to Reproduce:
+ Build LLVM from source (on GitHub)
+ Build AMD Device Libs
+ Build HIP
+ Build the latest PyTorch from GitHub

+ Build LLVM from source
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir -p build && cd build

cmake -G Ninja -DCMAKE_INSTALL_PREFIX=/opt/rocm/llvm -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=1 -DLLVM_TARGETS_TO_BUILD="AMDGPU;X86" -DLLVM_EXTERNAL_LLD_SOURCE_DIR=../lld -DLLVM_EXTERNAL_CLANG_SOURCE_DIR=../clang ../llvm

ninja
sudo ninja install

+ Build Device Libs
export PATH=/opt/rocm/llvm/bin:$PATH
git clone -b amd-stg-open https://github.com/RadeonOpenCompute/ROCm-Device-Libs.git
cd ROCm-Device-Libs
mkdir -p build && cd build
CC=clang CXX=clang++ cmake -G Ninja -DLLVM_DIR=/opt/rocm/llvm -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_WERROR=1 -DLLVM_ENABLE_ASSERTIONS=1 ..
ninja
sudo ninja install

+ Build HIP
git clone -b master https://github.com/ROCm-Developer-Tools/HIP.git
cd HIP
mkdir -p build && cd build
cmake -G Ninja -DCMAKE_INSTALL_PREFIX=/opt/rocm/hip -DHIP_COMPILER=clang -DCMAKE_BUILD_TYPE=Release ..
ninja
sudo ninja install

+ Build PyTorch
export USE_NINJA=1
export MAX_JOBS=8
export HIP_PLATFORM=hcc
git clone https://github.com/pytorch/pytorch.git
cd pytorch
git submodule update --init --recursive
python3 tools/amd_build/build_amd.py
pip3 install -r requirements.txt
python3 setup.py install --user
