rlleshi / phar

deep learning sex position classifier
Apache License 2.0
227 stars, 26 forks

Issues trying to get the demo working #6

Closed skier233 closed 1 year ago

skier233 commented 1 year ago

I've been following the steps to install and spent several hours today trying to get the demo working, but I'm getting errors. I've followed the steps in the instructions exactly. When doing a manual install, I get this error:

No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin'
Traceback (most recent call last):
  File "c:\users\tyler\source\repos\phar\mmaction2\demo\demo_skeleton.py", line 16, in <module>
    from mmdet.apis import inference_detector, init_detector
  File "C:\Users\tyler\source\repos\venv\lib\site-packages\mmdet\apis\__init__.py", line 2, in <module>
    from .inference import (async_inference_detector, inference_detector,
  File "C:\Users\tyler\source\repos\venv\lib\site-packages\mmdet\apis\inference.py", line 8, in <module>
    from mmcv.ops import RoIPool
  File "C:\Users\tyler\source\repos\venv\lib\site-packages\mmcv\ops\__init__.py", line 2, in <module>
    from .active_rotated_filter import active_rotated_filter
  File "C:\Users\tyler\source\repos\venv\lib\site-packages\mmcv\ops\active_rotated_filter.py", line 8, in <module>
    ext_module = ext_loader.load_ext(
  File "C:\Users\tyler\source\repos\venv\lib\site-packages\mmcv\utils\ext_loader.py", line 13, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "C:\Users\tyler\AppData\Local\Programs\Python\Python38\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed while importing _ext: The specified module could not be found.

When trying to use docker, I get an error about not having an NVIDIA driver (even though I do have one, with CUDA and cuDNN set up).

rlleshi commented 1 year ago

Unfortunately, I am not familiar with the installation on Windows. It looks like mmcv is having some trouble. Perhaps check this issue (and otherwise search the mmcv repo for similar issues when installing on Windows)?
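One way to narrow this down is to import each package in the failing chain separately and see which one breaks first. The sketch below is only a diagnostic aid, not part of this repo: `probe` and `import_report` are hypothetical helper names, and the module list is taken from the traceback above. If `mmcv` imports but `mmcv._ext` fails, that would suggest the installed mmcv build lacks a compiled extension matching the local PyTorch/CUDA setup.

```python
import importlib


def probe(module_name):
    """Try to import a module; return its version, 'ok', or a failure message."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError for missing DLLs, OSError on some systems
        return f"FAILED ({type(exc).__name__}: {exc})"
    return getattr(module, "__version__", "ok")


def import_report(names=("torch", "mmcv", "mmcv._ext", "mmdet")):
    """Probe every module from the traceback, in dependency order."""
    return {name: probe(name) for name in names}


if __name__ == "__main__":
    for name, status in import_report().items():
        print(f"{name:12s} {status}")
```

Run it with the same virtual environment's Python that runs the demo, so the result reflects the packages the demo actually sees.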

Could you please share the exact error that you get with docker?

skier233 commented 1 year ago

I saw that issue but didn't see a solution there. I'll try with docker again today and post my commands and error.

skier233 commented 1 year ago

With docker, I get the following error:

No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
...
...
Moviepy - Building video temp/video.mp4.
MoviePy - Writing audio in videoTEMP_MPY_wvf_snd.mp3
MoviePy - Done.
Moviepy - Writing video temp/video.mp4

Moviepy - Done !
Moviepy - video ready temp/video.mp4
load checkpoint from local path: checkpoints/har/timeSformer.pth
Traceback (most recent call last):
  File "src/demo/multimodial_demo.py", line 605, in <module>
    main()
  File "src/demo/multimodial_demo.py", line 544, in main
    RGB_MODEL = init_recognizer(args.rgb_config,
  File "/workspace/phar/mmaction2/mmaction/apis/inference.py", line 51, in init_recognizer
    model.to(device)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 673, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 409, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 671, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "/opt/conda/lib/python3.8/site-packages/torch/cuda/__init__.py", line 170, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

after running these commands:

docker build -f docker/Dockerfile . -t rlleshi/phar
docker run --mount type=bind,source="$(pwd)",target=/app -dit rlleshi/phar
docker ps -a
docker attach 45580c0d891e
cd ../..
cd app
python src/demo/multimodial_demo.py video.mp4 demo.mp4

rlleshi commented 1 year ago

And you have CUDA installed on your system?

skier233 commented 1 year ago

Yeah, it's installed. cuDNN as well.

skier233 commented 1 year ago

My guess is that it's having an issue finding the GPU through the container? Not entirely sure. I've been training NNs with TensorFlow on the GPU with CUDA, so it probably isn't strictly a CUDA or cuDNN issue.

rlleshi commented 1 year ago

Is the GPU available inside the container? You can try to execute torch.cuda.is_available() inside the container.
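A minimal sketch of that check, meant to be run with the container's Python. `cuda_status` is a hypothetical helper name; the `torch` calls are standard PyTorch. A `built_with_cuda` value of `None` would point at a CPU-only PyTorch build, while a CUDA version combined with `cuda_available: False` would point at the driver or GPU not being visible inside the container.

```python
def cuda_status():
    """Summarize what PyTorch can see; safe to call on machines without a GPU."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # True only if a usable driver and at least one GPU are visible.
        "cuda_available": available,
        "device_count": torch.cuda.device_count() if available else 0,
    }


if __name__ == "__main__":
    print(cuda_status())
```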

How are you running the container? With docker or nvidia-docker? You need to run it with the latter to make the GPUs available for the container.
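As a sketch of what a GPU-enabled launch could look like: on Docker 19.03 and later, `docker run --gpus all` is an alternative to the `nvidia-docker` wrapper, assuming the NVIDIA Container Toolkit is installed on the host. The helper below only builds the command line so it can be inspected before running; `gpu_docker_run` is a hypothetical name, `/path/to/phar` is a placeholder, and the image name and mount target are copied from the commands earlier in this thread.

```python
import shlex


def gpu_docker_run(image, host_dir, target="/app", gpus="all"):
    """Build a `docker run` argument list that exposes the host GPUs."""
    return [
        "docker", "run",
        "--gpus", gpus,  # assumption: NVIDIA Container Toolkit is set up on the host
        "--mount", f"type=bind,source={host_dir},target={target}",
        "-dit", image,
    ]


if __name__ == "__main__":
    cmd = gpu_docker_run("rlleshi/phar", "/path/to/phar")
    print(shlex.join(cmd))  # copy-pasteable form of the command
    # To launch it directly: subprocess.run(cmd, check=True)
```

Once the container is up, torch.cuda.is_available() should return True inside it if the GPU was passed through.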

skier233 commented 1 year ago

That's probably the issue. I was running with plain docker. Will try nvidia-docker.