Why is the inference code for video so slow?

The error occurred while installing halpecocotools, so I installed it manually with pip:
pip install halpecocotools
When I re-ran setup, SciPy complained that the minimum supported Python version is 3.9, so I re-installed it with pip:
pip install scipy
After that, running setup.py gave no errors and the install succeeded.

I could not install halpecocotools with the command 'pip install halpecocotools', so I downgraded Cython to 0.29.36, following this. But when I ran setup.py again, I got this error:

OSError: /home/anhay20171/anaconda3/envs/alphapose_0/lib/python3.7/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtGetStatusString, version libcublasLt.so.11

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 9, in <module>
    from torch.utils.cpp_extension import BuildExtension, CUDAExtension
  File "/home/anhay20171/anaconda3/envs/alphapose_0/lib/python3.7/site-packages/torch/__init__.py", line 217, in <module>
    _load_global_deps()
  File "/home/anhay20171/anaconda3/envs/alphapose_0/lib/python3.7/site-packages/torch/__init__.py", line 178, in _load_global_deps
    _preload_cuda_deps()
  File "/home/anhay20171/anaconda3/envs/alphapose_0/lib/python3.7/site-packages/torch/__init__.py", line 158, in _preload_cuda_deps
    ctypes.CDLL(cublas_path)
  File "/home/anhay20171/anaconda3/envs/alphapose_0/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/anhay20171/anaconda3/envs/alphapose_0/lib/python3.7/site-packages/nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtGetStatusString, version libcublasLt.so.11

I solved the above error by running the command below, following https://github.com/hpcaitech/ColossalAI/issues/2901:

conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
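
A quick way to confirm that the reinstalled PyTorch actually loads its CUDA libraries is to import it and print what it reports. This is only a minimal sketch: the helper name cuda_report is made up for illustration, and it relies on nothing beyond torch.__version__, torch.version.cuda, and torch.cuda.is_available().

```python
def cuda_report():
    """Describe the local PyTorch/CUDA setup as a small dict.

    cuda_report is a hypothetical helper, not part of AlphaPose or PyTorch.
    """
    report = {
        "torch_imported": False,
        "torch_version": None,
        "cuda_build": None,       # CUDA version PyTorch was built against
        "cuda_available": False,  # whether a usable GPU is visible
        "error": None,
    }
    try:
        # A broken libcublas shows up as OSError at import time,
        # a missing package as ImportError.
        import torch
    except (ImportError, OSError) as exc:
        report["error"] = repr(exc)
        return report

    report["torch_imported"] = True
    report["torch_version"] = torch.__version__
    report["cuda_build"] = torch.version.cuda
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False, the model may be running on the CPU, which could be one reason for very slow inference.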

But when I ran the command pip install git+ssh://git@github.com/facebookresearch/pytorch3d.git@stable, I got this error:

Collecting git+ssh://****@github.com/facebookresearch/pytorch3d.git@stable
  Cloning ssh://****@github.com/facebookresearch/pytorch3d.git (to revision stable) to /tmp/pip-req-build-ubm5dza1
  Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/facebookresearch/pytorch3d.git' /tmp/pip-req-build-ubm5dza1
  git@github.com: Permission denied (publickey).
  fatal: Could not read from remote repository.

  Please make sure you have the correct access rights
  and the repository exists.
  error: subprocess-exited-with-error

  × git clone --filter=blob:none --quiet 'ssh://****@github.com/facebookresearch/pytorch3d.git' /tmp/pip-req-build-ubm5dza1 did not run successfully.
  │ exit code: 128
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet 'ssh://****@github.com/facebookresearch/pytorch3d.git' /tmp/pip-req-build-ubm5dza1 did not run successfully.
│ exit code: 128
╰─> See above for output.

For that reason, I switched the command to use https instead of ssh, as below, which solved the problem:

 pip install git+https://git@github.com/facebookresearch/pytorch3d.git@stable
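
The same ssh-to-https rewrite can be sketched as a small helper for other pip VCS URLs. The function name ssh_to_https is hypothetical, and dropping the "git@" user part is an assumption that the repository is public and needs no credentials.

```python
from urllib.parse import urlsplit


def ssh_to_https(pip_url):
    """Rewrite a 'git+ssh://' pip URL to 'git+https://' without a user part.

    Hypothetical helper; URLs with any other scheme are returned unchanged.
    """
    parts = urlsplit(pip_url)
    if parts.scheme != "git+ssh":
        return pip_url
    # parts.hostname drops the 'git@' user; parts.path keeps '.git@stable'.
    return f"git+https://{parts.hostname}{parts.path}"


if __name__ == "__main__":
    print(ssh_to_https(
        "git+ssh://git@github.com/facebookresearch/pytorch3d.git@stable"
    ))
```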

When I ran the inference command for an image:

python scripts/demo_inference.py --cfg configs/halpe_26/resnet/256x192_res50_lr1e-3_1x.yaml --checkpoint pretrained_models/halpe26_fast_res50_256x192.pth --indir examples/demo_0 --save_img

I got an error like this:

global /io/opencv/modules/videoio/src/cap_ffmpeg_impl.hpp (2927) open Could not find encoder for codec_id=27, error: Encoder not found

global /io/opencv/modules/videoio/src/cap_ffmpeg_impl.hpp (3002) open VIDEOIO/FFMPEG: Failed to initialize VideoWriter

For that reason, I changed line 31 of Alphapose/alphapose/utils/detector.py from this:

self.fourcc = int(stream.get(cv2.CAP_PROP_FOURCC))

to this:

self.fourcc = cv2.VideoWriter_fourcc(*"mp4v")
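
The "codec_id=27" message suggests that the FOURCC copied from the input video maps to an encoder (probably H.264) that this OpenCV build cannot use, which would explain why a fixed "mp4v" tag avoids it. A FOURCC is four ASCII characters packed into a little-endian 32-bit integer, which is the value cv2.VideoWriter_fourcc returns. The pure-Python sketch below shows that packing and a possible fallback rule; the helper names and the list of tags treated as unsupported are assumptions for illustration, not AlphaPose code.

```python
def str_to_fourcc(tag):
    """Pack a 4-character tag into an int, lowest byte first."""
    if len(tag) != 4:
        raise ValueError("a FOURCC tag must have exactly 4 characters")
    return sum((ord(ch) & 0xFF) << (8 * i) for i, ch in enumerate(tag))


def fourcc_to_str(code):
    """Unpack an int FOURCC back into its 4-character tag."""
    return "".join(chr((code >> (8 * i)) & 0xFF) for i in range(4))


def pick_writer_fourcc(stream_fourcc, unsupported=("avc1", "h264"),
                       fallback="mp4v"):
    """Keep the input FOURCC unless it is in the (assumed) unsupported list."""
    if fourcc_to_str(stream_fourcc).lower() in unsupported:
        return str_to_fourcc(fallback)
    return stream_fourcc


if __name__ == "__main__":
    code = pick_writer_fourcc(str_to_fourcc("avc1"))
    print(code, fourcc_to_str(code))
```

Falling back only for known-bad tags keeps the original codec where it works, while hard-coding "mp4v" as in the one-line change above is simpler.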

After that change, inference took 43 s for one picture: image_2023_10_06T07_25_42_452Z

But when I ran the inference command for a video:

./scripts/inference.sh configs/halpe_26/resnet/256x192_res50_lr1e-3_1x.yaml pretrained_models/halpe26_fast_res50_256x192.pth tada2.mp4

It gets stuck at frame 5/605, as shown below, and I don't understand why: Screenshot 2023-10-06 165907
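
One possibility is that the run is very slow rather than stuck. Here is a back-of-the-envelope estimate, assuming (and this is only an assumption) that each video frame costs about as much as the 43 s single-image run above. That figure probably includes model loading, so it likely overstates the per-frame cost.

```python
def estimate_total_seconds(num_frames, seconds_per_frame):
    """Rough total runtime if every frame took the same time."""
    return num_frames * seconds_per_frame


if __name__ == "__main__":
    total = estimate_total_seconds(605, 43)
    print(f"{total} s, about {total / 3600:.1f} h")
```

With those numbers the estimate is 26015 s, roughly 7.2 hours, so a progress bar sitting at 5/605 for a few minutes would be consistent with slow per-frame inference.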
