timmeinhardt / trackformer

Implementation of "TrackFormer: Multi-Object Tracking with Transformers". [Conference on Computer Vision and Pattern Recognition (CVPR), 2022]
https://arxiv.org/abs/2101.02702
Apache License 2.0

The CUDA version I use is 11.1. What versions of torch and torchvision should I use to reproduce? #75

Closed by niangea 1 year ago

SushantGautam commented 1 year ago

Same

timmeinhardt commented 1 year ago

The installation readme mentions the required PyTorch and Torchvision versions. If you need to run different versions for a newer CUDA version, you are on your own: you need to try and see whether the code runs without errors and, in particular, whether the manually compiled code compiles without errors. In any case, it comes down to trial and error.

tostenzel commented 1 year ago

Hi, as Tim has written in a previous issue (https://github.com/timmeinhardt/trackformer/issues/41), the installation readme is wrong. He mentioned "2. Install PyTorch 1.7 and torchvision 0.8". With PyTorch 1.7, you can use CUDA 11.1 (see the table mentioned in the PyTorch link)!

There is a bug in torchvision 0.8. I am still working through everything, but the following command has worked for me so far: pip install torch==1.7.0+cu110 torchvision==0.8.1+cu110 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
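One way to confirm that the installed wheel really is the `+cu110` build named above is to look at the local version tag. This is only a minimal sketch and not part of the original comment: `cuda_tag` and `expected_tag` are hypothetical names, and it assumes that `python` on the PATH belongs to the environment in question.

```shell
# Hypothetical names: expected_tag and cuda_tag exist only for illustration.
expected_tag="cu110"

# Print the local version tag after "+", e.g. "cu110" for "1.7.0+cu110".
# Prints "none" when the version string carries no such tag.
cuda_tag() {
    case "$1" in
        *+*) printf '%s\n' "${1##*+}" ;;
        *)   printf '%s\n' "none" ;;
    esac
}

# Query PyTorch only if it is importable, so the check fails softly.
if python -c 'import torch' >/dev/null 2>&1; then
    installed="$(python -c 'import torch; print(torch.__version__)')"
    echo "torch ${installed}: tag $(cuda_tag "$installed"), expected ${expected_tag}"
else
    echo "torch is not importable in this environment"
fi
```

Builds installed through conda usually carry no `+cu...` suffix. In that case `python -c 'import torch; print(torch.version.cuda)'` reports the CUDA version the build was compiled against.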

I will post again if my attempt does not work out.

tostenzel commented 1 year ago

It worked for me, although I had to choose cuda 11.0 for my GPU. I guess you will find a way with PyTorch 1.7.0 if you try hard enough. My strategy was to get Trackformer's main dependency to work separately before turning to the complete Trackformer package.

Edit: Below is exactly what I did:

Install Python, e.g. in your home directory

Clone project

Conda Environment with Python 3.7
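A minimal sketch of this step, assuming conda is already installed. The environment name `trackformer` is an assumption (the thread does not state one), and the command sits inside a hypothetical function `create_env` so that nothing is created until the function is called.

```shell
# Assumption: the environment name is not stated in the thread.
ENV_NAME="trackformer"
PYTHON_VERSION="3.7"

# Hypothetical helper: creates the environment only when called.
create_env() {
    conda create -y -n "$ENV_NAME" "python=$PYTHON_VERSION"
}

# Report the major.minor version of the interpreter currently on PATH,
# or "unknown" if no python executable is found.
active_python_version() {
    python -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null \
        || echo "unknown"
}

echo "python on PATH: $(active_python_version) (wanted: $PYTHON_VERSION)"
```

After running `create_env` and `conda activate trackformer`, the last line should report 3.7.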

PyTorch

Unfortunately, if we install the cudatoolkit via conda, the nvcc compiler does not come with it. Therefore, we need to install it manually on top, to keep the build from picking up the inappropriate, locally pre-installed version in /usr/local/cuda. Unfortunately, there is no compiler package for our CUDA 11.0 (which we chose for PyTorch 1.7.1). Therefore, I chose a higher version once more.
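Whichever compiler package ends up installed, it is worth checking which nvcc a build would actually find. The sketch below is an illustration, not the original poster's command. The variable names `nvcc_path` and `nvcc_origin` are hypothetical, and it assumes an activated conda environment (conda sets `CONDA_PREFIX`). As far as I know, PyTorch's extension builder consults `CUDA_HOME` before falling back to the nvcc on the PATH, so that variable is printed too.

```shell
# Locate the nvcc that would be found first on PATH (empty if none).
nvcc_path="$(command -v nvcc 2>/dev/null || true)"

echo "CUDA_HOME=${CUDA_HOME:-<unset>}"

if [ -z "$nvcc_path" ]; then
    nvcc_origin="missing"
    echo "no nvcc on PATH"
else
    # Classify the compiler by whether it lives inside the active conda env.
    case "$nvcc_path" in
        "${CONDA_PREFIX:-/nonexistent}"/*) nvcc_origin="conda" ;;
        *)                                 nvcc_origin="system" ;;
    esac
    echo "nvcc: $nvcc_path ($nvcc_origin)"
    # Show the line of the version banner that names the CUDA release.
    nvcc --version | grep -i "release" || true
fi
```

If this reports `system` with a path under /usr/local/cuda, the build would use the pre-installed compiler rather than the one from the environment.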

Trackformer's main dependency is MultiScaleDeformableAttention from the Deformable-DETR repository. Deformable-DETR is a more efficient DETR variant that detects objects in images (not videos). Installing it is tricky, so we do it before the other Trackformer requirements.

We first install pycocotools (with fixed ignore flag) according to the Trackformer installation guide. That dependency is also used by Deformable-DETR without the version specification.

We continue with the Deformable-DETR requirements that I copied from the respective [repository](https://github.com/fundamentalvision/Deformable-DETR/blob/main/requirements.txt):

Next, we install MultiScaleDeformableAttention from the local files in this repository with

Finally, we test whether the installation was successful:

cd src/trackformer/models/ops
# unit test (should see all checking is True)
python test.py
cd ../../../..
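If `python test.py` stops with an import error instead of printing check results, a quick way to separate a failed build from a failed check is to try importing the compiled extension directly. This is a minimal sketch: it assumes the extension is importable under the module name `MultiScaleDeformableAttention`, as in the Deformable-DETR sources, and `msda_status` is a hypothetical variable name.

```shell
# Assumption: the compiled extension is importable under this module name.
# torch is imported first because the extension links against its libraries.
if python -c 'import torch, MultiScaleDeformableAttention' >/dev/null 2>&1; then
    msda_status="importable"
else
    msda_status="not importable"
fi
echo "MultiScaleDeformableAttention: $msda_status"
```

A result of "not importable" points at the build or install step (or at a missing torch), not at the numerical checks in test.py.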

Trackformer

Lastly, we install the other Trackformer requirements. Note that I changed numpy to a more recent version. Without that, we would get a DimensionMismatchError from running Trackformer's src/track.py.

We test whether the Trackformer installation was successful in two ways:

Installation Validation

1. Evaluation

Either download the MOT17 dataset (https://motchallenge.net/data/MOT17/) to the data folder via

Then, download and unpack the pretrained TrackFormer model files in the models directory:

Next, evaluate the pre-trained MOT17 models with MOT20 metrics via

2. Training

Try to train Trackformer on the MOT17 dataset for some batches via

niangea commented 1 year ago

@tostenzel I followed your method to install the environment, but encountered a problem while installing the MultiScaleDeformableAttention package; see issue #96.

niangea commented 1 year ago

This is my current environment:

PyTorch version: 1.7.0+cu110
Is debug build: True
CUDA used to build PyTorch: 11.0
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.26.3

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/opt/orion/lib/orion-cuda-11.0/libcudnn_orion.so.11.0.8
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] torch==1.7.0+cu110
[pip3] torchfile==0.1.0
[pip3] torchvision==0.8.1+cu110
[conda] numpy 1.18.5 pypi_0 pypi
[conda] torch 1.7.0+cu110 pypi_0 pypi
[conda] torchfile 0.1.0 pypi_0 pypi
[conda] torchvision 0.8.1+cu110 pypi_0 pypi
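A report in this format appears to come from PyTorch's bundled environment collector, `torch.utils.collect_env`. A minimal sketch for regenerating it after changing the setup, assuming `python` resolves to the environment under test (`report_status` is a hypothetical variable name):

```shell
# Run the collector only if torch can be imported; otherwise say why not.
if python -c 'import torch' >/dev/null 2>&1; then
    python -m torch.utils.collect_env
    report_status="collected"
else
    echo "torch is not importable; activate the environment first"
    report_status="skipped"
fi
```

Comparing the "PyTorch version" and "CUDA used to build PyTorch" lines before and after a reinstall shows whether the change actually took effect.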

timmeinhardt commented 1 year ago

You still do not have the correct PyTorch version, so these errors can come up. I cannot debug your system for configurations other than the one we suggested.