microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Unable to build onnxruntime_v1.10.0 C++ api with --enable_memory_profile --enable_cuda_profiling flags #11607

Open abilashravi-ta opened 2 years ago

abilashravi-ta commented 2 years ago

Description: I am trying to build onnxruntime (v1.10.0) from source, following these instructions, with the additional flags --enable_memory_profile and --enable_cuda_profiling to enable profiling.

When I use the following command to build and install

git clone --recursive --branch v1.10.0 https://github.com/Microsoft/onnxruntime
cd onnxruntime
./build.sh --cuda_home /home/azureuser/miniconda3/envs/cuda_conda_v1/ --cudnn_home /usr/lib/x86_64-linux-gnu/ --use_cuda --enable_memory_profile --enable_cuda_profiling --config RelWithDebInfo --build_shared_lib --build_wheel --skip_tests --parallel 6

I get build errors like the following (I added set(CMAKE_VERBOSE_MAKEFILE on) to the CMakeLists.txt file for more detail):

cd /tmp/onnx_v1.10.0_v1/onnxruntime/build/Linux/RelWithDebInfo/external/onnx && /usr/local/bin/cmake -P CMakeFiles/onnx.dir/cmake_clean_target.cmake
cd /tmp/onnx_v1.10.0_v1/onnxruntime/build/Linux/RelWithDebInfo/external/onnx && /usr/local/bin/cmake -E cmake_link_script CMakeFiles/onnx.dir/link.txt --verbose=1
/usr/bin/ar qc libonnx.a CMakeFiles/onnx.dir/onnx/checker.cc.o CMakeFiles/onnx.dir/onnx/common/assertions.cc.o CMakeFiles/onnx.dir/onnx/common/interned_strings.cc.o CMakeFiles/onnx.dir/onnx/common/ir_pb_converter.cc.o CMakeFiles/onnx.dir/onnx/common/model_helpers.cc.o CMakeFiles/onnx.dir/onnx/common/path.cc.o CMakeFiles/onnx.dir/onnx/common/status.cc.o CMakeFiles/onnx.dir/onnx/defs/attr_proto_util.cc.o CMakeFiles/onnx.dir/onnx/defs/controlflow/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/controlflow/old.cc.o CMakeFiles/onnx.dir/onnx/defs/data_type_utils.cc.o CMakeFiles/onnx.dir/onnx/defs/function.cc.o CMakeFiles/onnx.dir/onnx/defs/generator/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/generator/old.cc.o CMakeFiles/onnx.dir/onnx/defs/logical/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/logical/old.cc.o CMakeFiles/onnx.dir/onnx/defs/math/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/math/old.cc.o CMakeFiles/onnx.dir/onnx/defs/nn/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/nn/old.cc.o CMakeFiles/onnx.dir/onnx/defs/object_detection/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/object_detection/old.cc.o CMakeFiles/onnx.dir/onnx/defs/optional/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/parser.cc.o CMakeFiles/onnx.dir/onnx/defs/printer.cc.o CMakeFiles/onnx.dir/onnx/defs/quantization/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/quantization/old.cc.o CMakeFiles/onnx.dir/onnx/defs/reduction/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/reduction/old.cc.o CMakeFiles/onnx.dir/onnx/defs/rnn/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/rnn/old.cc.o CMakeFiles/onnx.dir/onnx/defs/schema.cc.o CMakeFiles/onnx.dir/onnx/defs/sequence/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/shape_inference.cc.o CMakeFiles/onnx.dir/onnx/defs/tensor/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/tensor/old.cc.o CMakeFiles/onnx.dir/onnx/defs/tensor/utils.cc.o CMakeFiles/onnx.dir/onnx/defs/tensor_proto_util.cc.o CMakeFiles/onnx.dir/onnx/defs/tensor_util.cc.o CMakeFiles/onnx.dir/onnx/defs/traditionalml/defs.cc.o CMakeFiles/onnx.dir/onnx/defs/traditionalml/old.cc.o 
CMakeFiles/onnx.dir/onnx/defs/training/defs.cc.o CMakeFiles/onnx.dir/onnx/onnxifi_utils.cc.o CMakeFiles/onnx.dir/onnx/shape_inference/implementation.cc.o CMakeFiles/onnx.dir/onnx/version_converter/convert.cc.o CMakeFiles/onnx.dir/onnx/version_converter/helper.cc.o
/usr/bin/ranlib libonnx.a
make[2]: Leaving directory '/tmp/onnx_v1.10.0_v1/onnxruntime/build/Linux/RelWithDebInfo'
[ 44%] Built target onnx
make[1]: Leaving directory '/tmp/onnx_v1.10.0_v1/onnxruntime/build/Linux/RelWithDebInfo'
make: *** [Makefile:169: all] Error 2
Traceback (most recent call last):
  File "/tmp/onnx_v1.10.0_v1/onnxruntime/tools/ci_build/build.py", line 2362, in <module>
    sys.exit(main())
  File "/tmp/onnx_v1.10.0_v1/onnxruntime/tools/ci_build/build.py", line 2282, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "/tmp/onnx_v1.10.0_v1/onnxruntime/tools/ci_build/build.py", line 1174, in build_targets
    run_subprocess(cmd_args, env=env)
  File "/tmp/onnx_v1.10.0_v1/onnxruntime/tools/ci_build/build.py", line 639, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
  File "/tmp/onnx_v1.10.0_v1/onnxruntime/tools/python/util/run.py", line 42, in run
    completed_process = subprocess.run(
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/local/bin/cmake', '--build', '/tmp/onnx_v1.10.0_v1/onnxruntime/build/Linux/RelWithDebInfo', '--config', 'RelWithDebInfo', '--', '-j6']' returned non-zero exit status 2.

System information

Here is the full build recording: build_recording.txt

The build file generation succeeds when I use

./build.sh --cuda_home /usr/local/cuda --cudnn_home /usr/lib/x86_64-linux-gnu/ --use_cuda --config RelWithDebInfo --build_shared_lib --build_wheel --skip_tests --parallel 6

But it fails when I use

./build.sh --cuda_home /usr/local/cuda --cudnn_home /usr/lib/x86_64-linux-gnu/ --use_cuda --enable_memory_profile --enable_cuda_profiling --config RelWithDebInfo --build_shared_lib --build_wheel --skip_tests --parallel 6

I have also added the location of the CUPTI library to PATH. This is what my PATH variable looks like:

$ echo $PATH
/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/cuda-11.4/targets/x86_64-linux/lib
$ ls /usr/local/cuda-11.4/targets/x86_64-linux/lib | grep cupti
libcupti.so
libcupti.so.11.4
libcupti.so.2021.2.2
libcupti_static.a

Any suggestions would be helpful!

RyanUnderhill commented 2 years ago

Looks like a bug on this line: https://github.com/microsoft/onnxruntime/blob/0292356bd7ce5b379be8ee9820bfbf4fd321ebc5/onnxruntime/core/framework/allocation_planner.cc#L889

That line ignores the return value of a function marked [[nodiscard]]. I've marked this as a bug and will update here once the fix is known, in case you want to patch it locally.

RyanUnderhill commented 2 years ago

Change merged! (It had one more change than what I mentioned above.) You might be able to cherry-pick it into your local branch, but it's trivial to apply manually.

abilashravi-ta commented 2 years ago

@RyanUnderhill, I added the changes from ceaaf397ac0d50dc1a3579dfa8483b1c85518037 to my local branch, and I'm now able to build onnxruntime (v1.10.0) successfully with the --enable_memory_profile and --enable_cuda_profiling flags!

I'm referring to this ONNX Runtime performance tuning section. The .json file it talks about can be generated by building onnxruntime with just the --enable_cuda_profiling flag. It does not mention how to do memory profiling, i.e. what additional info/features I get by building with the --enable_memory_profile flag. Can you point me to some resource?

hychiang-git commented 1 year ago

Same question.

RyanUnderhill commented 1 year ago

@abilashravi-ta @ken012git There is information on the pull request that added it here: https://github.com/microsoft/onnxruntime/pull/5658