visionml / pytracking

Visual tracking library based on PyTorch.
GNU General Public License v3.0

Error building extension '_prroi_pooling' #408

Closed by setarekhosravi 1 year ago

setarekhosravi commented 1 year ago

I have this problem when I run run_video.py or run_webcam.py:

python run_video.py dimp dimp50 /media/strh/MyDrive/Track/OtherTracking/tapnet/tapnet_results/second/InShot_20231113_162218069.mp4
Using /home/strh/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/strh/.cache/torch_extensions/py310_cu117/_prroi_pooling/build.ninja...
Building extension module _prroi_pooling...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF prroi_pooling_gpu.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda-11.7/include -isystem /home/strh/anaconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -c /media/strh/MyDrive/Track/OtherTracking/pytracking/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c -o prroi_pooling_gpu.o 
FAILED: prroi_pooling_gpu.o 
c++ -MMD -MF prroi_pooling_gpu.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda-11.7/include -isystem /home/strh/anaconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -c /media/strh/MyDrive/Track/OtherTracking/pytracking/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c -o prroi_pooling_gpu.o 
/media/strh/MyDrive/Track/OtherTracking/pytracking/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c:17:10: fatal error: THC/THC.h: No such file or directory
   17 | #include <THC/THC.h>
      |          ^~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
    subprocess.run(
  File "/home/strh/anaconda3/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/run_video.py", line 38, in <module>
    main()
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/run_video.py", line 34, in main
    run_video(args.tracker_name, args.tracker_param,args.videofile, args.optional_box, args.debug, args.save_results)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/run_video.py", line 20, in run_video
    tracker.run_video_generic(videofilepath=videofile, optional_box=optional_box, debug=debug, save_results=save_results)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../pytracking/evaluation/tracker.py", line 395, in run_video_generic
    out = tracker.track(frame, info)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../pytracking/evaluation/multi_object_wrapper.py", line 165, in track
    out = self.trackers[obj_id].initialize(image, init_info_split[obj_id])
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../pytracking/tracker/dimp/dimp.py", line 84, in initialize
    self.init_classifier(init_backbone_feat)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../pytracking/tracker/dimp/dimp.py", line 573, in init_classifier
    self.target_filter, _, losses = self.net.classifier.get_filter(x, target_boxes, num_iter=num_iter,
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/models/target_classifier/linear_filter.py", line 94, in get_filter
    weights = self.filter_initializer(feat, bb)
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/models/target_classifier/initializer.py", line 164, in forward
    weights = self.filter_pool(feat, bb)
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/models/target_classifier/initializer.py", line 45, in forward
    return self.prroi_pool(feat, roi1)
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/external/PreciseRoIPooling/pytorch/prroi_pool/prroi_pool.py", line 28, in forward
    return prroi_pool2d(features, rois, self.pooled_height, self.pooled_width, self.spatial_scale)
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/external/PreciseRoIPooling/pytorch/prroi_pool/functional.py", line 44, in forward
    _prroi_pooling = _import_prroi_pooling()
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/external/PreciseRoIPooling/pytorch/prroi_pool/functional.py", line 30, in _import_prroi_pooling
    _prroi_pooling = load_extension(
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_prroi_pooling'

These are my CUDA, PyTorch, torchvision, and GCC versions:

gcc --version
gcc (Ubuntu 13.1.0-8ubuntu1~20.04.2) 13.1.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

nvidia-smi
Tue Nov 14 15:30:04 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:02:00.0 Off |                  N/A |
| N/A   46C    P8    N/A /  N/A |      9MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1136      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1735      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0

>>> torch.__version__
'2.0.1+cu117'
>>> torchvision.__version__
'0.15.2'

Please help me solve this problem.

eliafranc commented 1 year ago

Hey,

I was having similar issues. What worked for me in the end was cloning the PreciseRoIPooling repository directly into the external directory. I had originally downloaded the PreciseRoIPooling zip file and unpacked it, and that was causing my problems. The PreciseRoIPooling README states the following:

Causion: To install the library, please git clone the repository instead of downloading the zip file, since source files inside the folder ./pytorch/prroi_pool/src/ and tensorflow/prroi_pool/src/kernels/external are symbol-linked. Downloading the repository as a zip file will break these symbolic links. Also, there are reports indicating that Windows git versions also breaks the symbol links. See https://github.com/vacancy/PreciseRoIPooling/issues/58.

I guess cloning the pytracking repository with something like git clone --recurse-submodules should be fine.

EDIT: I just found out more. The pytracking submodule points to an older version of the PreciseRoIPooling repository. If you click on the submodule in /ltr/external/, look at which branch you are on in the repository you get redirected to. Switching to the main branch makes it obvious that some fix commits, which are crucial for building, were pushed later.
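To check whether a checkout was affected by the zip problem quoted above, one option is to inspect the files in the extension's src directory. The following is a minimal sketch, not part of pytracking or PreciseRoIPooling: the function names are hypothetical, and the "stub" heuristic (a tiny one-line file whose content is a relative path) is an assumption about how a flattened symbolic link tends to look after unpacking a zip.

```python
import os
import tempfile


def _looks_like_link_stub(path, max_bytes=256):
    """Heuristic (an assumption): a flattened symlink is a tiny one-line
    file whose whole content is a relative path to the real source."""
    if os.path.getsize(path) > max_bytes:
        return False
    with open(path, "r", errors="replace") as handle:
        text = handle.read().strip()
    return "\n" not in text and text.startswith(("../", "./"))


def classify_source_files(src_dir):
    """Label each entry as 'symlink', 'broken-symlink', 'suspect-stub' or 'regular'."""
    report = {}
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if os.path.islink(path):
            # os.path.exists follows the link, so False means it dangles.
            report[name] = "symlink" if os.path.exists(path) else "broken-symlink"
        elif os.path.isfile(path) and _looks_like_link_stub(path):
            report[name] = "suspect-stub"
        else:
            report[name] = "regular"
    return report


if __name__ == "__main__":
    # Directory taken from the traceback, relative to the pytracking root.
    src = os.path.join("ltr", "external", "PreciseRoIPooling",
                       "pytorch", "prroi_pool", "src")
    if os.path.isdir(src):
        for name, kind in classify_source_files(src).items():
            print(f"{kind:15s} {name}")
    else:
        print(f"{src} not found; run this from the pytracking root")
```

Any entry reported as broken-symlink or suspect-stub would point towards re-cloning with git rather than unpacking a zip.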

Hope this helps, cheers

setarekhosravi commented 1 year ago

@eliafranc Thank you, but it didn't work for me. Now the error is a little different.

Using /home/strh/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/strh/.cache/torch_extensions/py310_cu117/_prroi_pooling/build.ninja...
Building extension module _prroi_pooling...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] /usr/local/cuda-11.7:/bin/nvcc  -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda-11.7:/include -isystem /home/strh/anaconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -std=c++17 -c /media/strh/MyDrive/Track/OtherTracking/pytracking/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu_impl.cu -o prroi_pooling_gpu_impl.cuda.o 
FAILED: prroi_pooling_gpu_impl.cuda.o 
/usr/local/cuda-11.7:/bin/nvcc  -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda-11.7:/include -isystem /home/strh/anaconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -std=c++17 -c /media/strh/MyDrive/Track/OtherTracking/pytracking/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu_impl.cu -o prroi_pooling_gpu_impl.cuda.o 
/bin/sh: 1: /usr/local/cuda-11.7:/bin/nvcc: not found
[2/3] c++ -MMD -MF prroi_pooling_gpu.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda-11.7:/include -isystem /home/strh/anaconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -c /media/strh/MyDrive/Track/OtherTracking/pytracking/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c -o prroi_pooling_gpu.o 
FAILED: prroi_pooling_gpu.o 
c++ -MMD -MF prroi_pooling_gpu.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/strh/anaconda3/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda-11.7:/include -isystem /home/strh/anaconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -c /media/strh/MyDrive/Track/OtherTracking/pytracking/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c -o prroi_pooling_gpu.o 
In file included from /media/strh/MyDrive/Track/OtherTracking/pytracking/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c:15:
/home/strh/anaconda3/lib/python3.10/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h: No such file or directory
    5 | #include <cuda_runtime_api.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
    subprocess.run(
  File "/home/strh/anaconda3/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/run_video.py", line 38, in <module>
    main()
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/run_video.py", line 34, in main
    run_video(args.tracker_name, args.tracker_param,args.videofile, args.optional_box, args.debug, args.save_results)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/run_video.py", line 20, in run_video
    tracker.run_video_generic(videofilepath=videofile, optional_box=optional_box, debug=debug, save_results=save_results)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../pytracking/evaluation/tracker.py", line 395, in run_video_generic
    out = tracker.track(frame, info)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../pytracking/evaluation/multi_object_wrapper.py", line 165, in track
    out = self.trackers[obj_id].initialize(image, init_info_split[obj_id])
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../pytracking/tracker/dimp/dimp.py", line 84, in initialize
    self.init_classifier(init_backbone_feat)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../pytracking/tracker/dimp/dimp.py", line 573, in init_classifier
    self.target_filter, _, losses = self.net.classifier.get_filter(x, target_boxes, num_iter=num_iter,
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/models/target_classifier/linear_filter.py", line 94, in get_filter
    weights = self.filter_initializer(feat, bb)
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/models/target_classifier/initializer.py", line 164, in forward
    weights = self.filter_pool(feat, bb)
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/models/target_classifier/initializer.py", line 45, in forward
    return self.prroi_pool(feat, roi1)
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/external/PreciseRoIPooling/pytorch/prroi_pool/prroi_pool.py", line 28, in forward
    return prroi_pool2d(features, rois, self.pooled_height, self.pooled_width, self.spatial_scale)
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/external/PreciseRoIPooling/pytorch/prroi_pool/functional.py", line 44, in forward
    _prroi_pooling = _import_prroi_pooling()
  File "/media/strh/MyDrive/Track/OtherTracking/pytracking/pytracking/../ltr/external/PreciseRoIPooling/pytorch/prroi_pool/functional.py", line 30, in _import_prroi_pooling
    _prroi_pooling = load_extension(
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/strh/anaconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_prroi_pooling'

Edit: I've changed the CUDA path in .bashrc but I have the same error.
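The "/usr/local/cuda-11.7:/bin/nvcc: not found" line in the log above suggests that the CUDA root picked up by the build ends with a stray colon. The sketch below is one way to sanity-check that value before rebuilding. It assumes a Linux layout (bin/nvcc under the CUDA root) and assumes the root comes from the CUDA_HOME or CUDA_PATH environment variable; check_cuda_home is a hypothetical helper, not a PyTorch function.

```python
import os


def check_cuda_home(value):
    """Return a list of problems found in a CUDA root directory setting."""
    if not value:
        return ["CUDA_HOME is empty or unset"]
    problems = []
    # The setting must be a single directory, not a PATH-style list,
    # so a trailing ':' or ';' is almost certainly a typo.
    if value.endswith((":", ";")):
        problems.append(f"stray trailing separator in {value!r}")
    nvcc = os.path.join(value, "bin", "nvcc")  # Linux layout assumed
    if not os.path.isfile(nvcc):
        problems.append(f"nvcc not found at {nvcc}")
    return problems


if __name__ == "__main__":
    current = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
    for line in check_cuda_home(current) or ["CUDA root looks fine"]:
        print(line)
```

Changes to .bashrc only take effect in a new shell (or after sourcing the file), so it is worth running the check in the same terminal that launches run_video.py.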

setarekhosravi commented 1 year ago

I think the problem is the PyTorch and CUDA versions. I should install the exact versions that were used in this project.
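One way to test that suspicion is to check whether the two headers named in the fatal errors (THC/THC.h and cuda_runtime_api.h) exist in the include directories passed to the compiler. This is a rough sketch: locate_headers is a hypothetical helper, and the directory list is copied from the -isystem flags in the first log, so it has to be adjusted for any other machine.

```python
import os


def locate_headers(include_dirs, headers):
    """Map each header to the first include directory containing it, or None."""
    found = {}
    for header in headers:
        found[header] = next(
            (d for d in include_dirs if os.path.isfile(os.path.join(d, header))),
            None,
        )
    return found


if __name__ == "__main__":
    # -isystem directories taken from the failing compiler command in the log.
    include_dirs = [
        "/home/strh/anaconda3/lib/python3.10/site-packages/torch/include",
        "/usr/local/cuda-11.7/include",
    ]
    wanted = ["THC/THC.h", "cuda_runtime_api.h"]
    for header, where in locate_headers(include_dirs, wanted).items():
        print(f"{header}: {where or 'NOT FOUND'}")
```

A missing THC/THC.h points at the installed PyTorch headers, while a missing cuda_runtime_api.h points at the CUDA include path.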

eliafranc commented 1 year ago

Depending on which CUDA version you need, you might be able to use the Docker image I wrote, which works with pytracking. I used CUDA 12.2, but if you need a lower version you can try changing the FROM line in the Dockerfile from nvidia/cuda:12.2.0-devel-ubuntu20.04 to nvidia/cuda:11.7.1-devel-ubuntu20.04.

setarekhosravi commented 1 year ago

@eliafranc Thank you, I'll try it.

setarekhosravi commented 1 year ago

I found that the problem was exactly the PyTorch version: downgrading PyTorch from 2.0.1 to 1.13.0 solved it. @eliafranc I used your tips and can now use the repo. Thank you very much for your patience and attention.
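A small guard at start-up could flag this situation before the extension build is attempted. The sketch below only encodes the two data points from this thread (1.13.0 worked, 2.0.1 did not); the helper names are hypothetical, and treating every release newer than 1.13.0 as suspect is an assumption, not a tested compatibility range.

```python
import re


def release_tuple(version):
    """'2.0.1+cu117' -> (2, 0, 1); local build tags such as '+cu117' are ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


# The only version reported to work in this thread.
KNOWN_GOOD = release_tuple("1.13.0")


def newer_than_known_good(version):
    """True if the given version is newer than the one reported to work."""
    return release_tuple(version) > KNOWN_GOOD


if __name__ == "__main__":
    for candidate in ("1.13.0", "2.0.1+cu117"):
        note = "untested here, build may fail" if newer_than_known_good(candidate) else "ok"
        print(f"torch {candidate}: {note}")
```

In a real script the string would come from torch.__version__ instead of the hard-coded candidates.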

ThePengg commented 10 months ago

No module named 'prroi_pool'

Obadahheyari commented 9 months ago

Same problem, please help me.

eliafranc commented 8 months ago

Did you initialize the submodule, @ThePengg @Obadahheyari?

If not, run git submodule update --init --recursive
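Before running the command, a quick check can show whether the submodule directory is actually empty, which is what an uninitialized submodule looks like on disk. This is a minimal sketch: submodule_status is a hypothetical helper, and the default path is the one that appears in the tracebacks above.

```python
import os


def submodule_status(repo_root, relative_path="ltr/external/PreciseRoIPooling"):
    """Return 'missing', 'empty' or 'populated' for the submodule directory."""
    path = os.path.join(repo_root, relative_path)
    if not os.path.isdir(path):
        return "missing"
    # Ignore git's own bookkeeping entry; only real content counts.
    entries = [name for name in os.listdir(path) if name != ".git"]
    return "populated" if entries else "empty"


if __name__ == "__main__":
    status = submodule_status(os.getcwd())
    print(f"PreciseRoIPooling submodule: {status}")
    if status != "populated":
        print("try: git submodule update --init --recursive")
```

Run it from the root of the pytracking checkout; a result of empty or missing would be consistent with the "No module named 'prroi_pool'" error.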