Open turmoni opened 2 years ago
ORT has two different kinds of execution providers;
So the question here is: which way does ROCm want to go? That would determine whether rocm should be skipped in gen_def.py or not.
Describe the bug Attempting to build onnxruntime on Linux with both `--use_rocm` and `--build_shared_lib` leads to several errors that block compilation. This happens with both the v1.10.0 and master branches, but I'm only reporting issues that are still present in master.

Before I go any further, I would like to make it clear that I'm not a direct user of onnxruntime, nor do I understand anything about CMake or the build process. The changes I've made seem to work for me, but I'm definitely not saying they're correct and free from downstream issues, or that some of the later build issues aren't caused by changes I've made earlier on, and I'm not 100% sure I've ended up with a functional ROCm version. For all I know, building this way might not be a sensible thing to do. I also see that there's a MIGraphX option and I don't know what the distinction is between that and just using ROCm, so if one supersedes the other this may not be worth anyone caring about beyond making that clear.
The command line I'm using is:
(As an aside, the first issue I ran into was not realising that `--rocm_home` is required even if ROCm is installed in the default location; this comment is mainly for anyone else who might be attempting to reproduce my steps.)

cmake/onnxruntime.cmake
The first issue is:
From looking at potentially related changes, I removed `${PROVIDERS_ROCM}` from `set(onnxruntime_INTERNAL_LIBRARIES` in `cmake/onnxruntime.cmake` (line 171).

C++ included in `generated_source.c`
The file generated by `tools/ci_build/gen_def.py` includes `onnxruntime/core/providers/rocm/rocm_provider_factory.h`, which in turn includes `include/onnxruntime/core/framework/provider_options.h`, which is a C++ header. I see that several headers are already skipped so I did the same for ROCm:

I also made it skip `rocm` when reading all the `symbols.txt` files, but I imagine this is meant to be changed somewhere else, if it is in fact meant to be omitted.

make install
The final issue is with the `make install` step. In the generated `Linux/Release/cmake_install.cmake`, the following block attempts to install something that doesn't exist:

For my fix, I just removed that section from the generated file; I didn't investigate its source.
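
For anyone trying to follow the `gen_def.py` change described under "C++ included in generated_source.c", the idea can be sketched roughly as below. This is a hypothetical, self-contained illustration and not the actual contents of `tools/ci_build/gen_def.py`: the function names, the `SKIPPED_PROVIDERS` set, the directory layout and the include path template are all assumptions made for the example. It only shows the two things I changed in spirit: leaving a provider's header out of the generated C source, and ignoring that provider's `symbols.txt`.

```python
# Hypothetical sketch only; this is NOT the real tools/ci_build/gen_def.py.
from pathlib import Path

# Assumption: providers whose factory header pulls in C++ and therefore
# cannot be included from a generated C file.
SKIPPED_PROVIDERS = {"rocm"}


def collect_symbols(src_root, providers, skipped=SKIPPED_PROVIDERS):
    """Read <src_root>/<provider>/symbols.txt for each provider not skipped.

    The directory layout is an assumption made for this example.
    """
    symbols = []
    for provider in providers:
        if provider in skipped:
            continue  # do not export symbols for skipped providers
        symbols_file = Path(src_root) / provider / "symbols.txt"
        for line in symbols_file.read_text().splitlines():
            name = line.strip()
            if name and name not in symbols:
                symbols.append(name)
    return symbols


def render_generated_source(providers, skipped=SKIPPED_PROVIDERS):
    """Build the text of a C file that includes one header per provider."""
    lines = ["#include <onnxruntime_c_api.h>"]
    for provider in providers:
        if provider in skipped:
            continue  # leave out headers that would drag C++ into C code
        # Illustrative include path template, assumed for this sketch.
        lines.append(
            f"#include <core/providers/{provider}/{provider}_provider_factory.h>"
        )
    return "\n".join(lines) + "\n"
```

With `providers=["cpu", "rocm"]`, both helpers would ignore `rocm` entirely. Whether skipping is the right fix at all depends on the answer to the question of which kind of execution provider ROCm is meant to be.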
Urgency None.
System information
To Reproduce
./build.sh --use_rocm --rocm_home=/opt/rocm --update --build --config=Release --build_shared_lib
Expected behavior The project builds
Screenshots N/A
Additional context N/A