alex-razor closed this issue 5 years ago.
Hi, can you try modifying line 26 of 'demo.py' as below? torch.multiprocessing.set_start_method('spawn', force=True)
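For context, a minimal sketch of what that change does. It uses the stdlib multiprocessing module here, whose API torch.multiprocessing mirrors:

```python
import multiprocessing

# 'spawn' starts fresh interpreter processes instead of forking, which
# avoids inheriting CUDA/shared-memory state from the parent process.
# force=True overrides a start method that was already set elsewhere,
# which is why the suggested fix passes it.
multiprocessing.set_start_method('spawn', force=True)

print(multiprocessing.get_start_method())  # prints 'spawn'
```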
Thank you for your reply. However, it didn't help; I get the same error.
Oh, that's weird... We have only tested with PyTorch 1.1 so far. Can you check if PyTorch 1.1 works for you?
That did work for me. Thanks!
RuntimeError: error executing torch_shm_manager at "/hdd/kps_pipeline/venv/lib/python3.6/site-packages/torch/bin/torch_shm_manager" at /pytorch/torch/lib/libshm/core.cpp:99
How can I solve it?
I'm also hitting this, but on torch==1.3.0
same on torch==1.3.0
os: MacOS 10.14.6
Do you know how to solve it? Thank you!
I was seeing this error with 1.3.0. Upgrading to 1.3.1 fixed it for me.
@Abhipray I have torch==1.3.1 installed, but it isn't working for me. I get the same error. Has anyone found the solution to this problem?
I had the same problem. When I used the following versions, AlphaPose worked and generated a JSON file for the images.
I created a virtual environment with Python 3.6. If you don't know how to do it, have a look at https://gist.github.com/frfahim/73c0fad6350332cef7a653bcd762f08d
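As a rough equivalent to the linked gist, the stdlib venv module can create the environment directly (the path below is illustrative; in practice you would use a fixed location and then `source .../bin/activate`):

```python
import os
import tempfile
import venv

# Create a throwaway virtual environment under a temp directory;
# with_pip=True would also bootstrap pip into it.
target = os.path.join(tempfile.mkdtemp(), "alphapose-env")
venv.create(target, with_pip=False)

# The environment contains a bin/ directory (Scripts/ on Windows).
print(os.path.isdir(os.path.join(target, "bin"))
      or os.path.isdir(os.path.join(target, "Scripts")))  # prints True
```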
I installed the latest version of PyTorch using https://pytorch.org/ and selected CUDA 9.2 (CUDA 10.0 did not work). I used:
pip3 install torch==1.3.1+cu92 torchvision==0.4.2+cu92 -f https://download.pytorch.org/whl/torch_stable.html
I installed CUDA 9.2 from https://developer.nvidia.com/cuda-92-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=runfilelocal
Then follow the AlphaPose instructions to download the models, and:
git clone -b pytorch https://github.com/MVIG-SJTU/AlphaPose.git
pip3 install -r requirements.txt
(remove torch, torchvision, and ntpath from requirements.txt first), then run:
python3 demo.py --indir examples/demo --outdir examples/res
SUMMARY:
Ubuntu 16.04
Python 3.6
CUDA 9.2
cuDNN 7
torch==1.3.1+cu92
torchvision==0.4.2+cu92
GPU: NVIDIA 2080 Ti
Hello @Ehsan-Yaghoubi, how many FPS did you get? Thanks!
Hi, I only used it to produce the pose information for my own dataset. I didn't check the metrics as I didn't need them.
Hi @Ehsan-Yaghoubi, thanks for your reply!
It still happens with PyTorch 1.4
Set num_workers=0
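To make that workaround concrete, a minimal sketch (the dataset here is a stand-in, not from AlphaPose):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.arange(8, dtype=torch.float32))

# num_workers=0 loads every batch in the main process, so no worker
# processes (and no torch_shm_manager shared memory) are involved.
loader = DataLoader(ds, batch_size=4, num_workers=0)

print(sum(1 for _ in loader))  # 8 samples / batch size 4 = 2 batches
```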
torch.multiprocessing.set_start_method('spawn', force=True) works well with num_workers > 0 on macOS.
I was just able to fix this by commenting a line I had added to fix an issue on a different system:
Old: torch.multiprocessing.set_sharing_strategy('file_system')
New: # torch.multiprocessing.set_sharing_strategy('file_system')
I think the problem in my case might be caused by my system having CUDA 10.2 while PyTorch is installed as the 10.1 build. But commenting out the above line at the start of my script fixed the problem, at least in my case.
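For reference, the available sharing strategies can be inspected at runtime (the exact output differs by platform):

```python
import torch.multiprocessing as mp

# 'file_system' backs shared tensors with files on disk;
# 'file_descriptor' (the Linux default) passes file descriptors
# between processes via the torch_shm_manager helper that the
# RuntimeError above complains about.
print(mp.get_all_sharing_strategies())
print(mp.get_sharing_strategy())
```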
@nlml Works for me, thanks! I have PyTorch 1.4 with CUDA 10.2.
Adding --sp works fine for me.
Hitting the same error:
(alphapose) zrrr@zrrr-GL552VW:~/Projects/AlphaPose$ python scripts/demo_inference.py --cfg configs/coco/resnet/256x192_res50_lr1e-3_1x.yaml --checkpoint pretrained_models/fast_res50_256x192.pth --indir examples/demo/
Traceback (most recent call last):
File "scripts/demo_inference.py", line 175, in <module>
det_loader = DetectionLoader(input_source, get_detector(args), cfg, args, batchSize=args.detbatch, mode=mode, queueSize=args.qsize)
File "/home/zrrr/Projects/AlphaPose/detector/apis.py", line 12, in get_detector
from detector.yolo_api import YOLODetector
File "/home/zrrr/Projects/AlphaPose/detector/yolo_api.py", line 27, in <module>
from detector.nms import nms_wrapper
File "/home/zrrr/Projects/AlphaPose/detector/nms/__init__.py", line 1, in <module>
from .nms_wrapper import nms, soft_nms
File "/home/zrrr/Projects/AlphaPose/detector/nms/nms_wrapper.py", line 4, in <module>
from . import nms_cpu, nms_cuda
ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory
Python 3.6.13
Cuda Toolkit 9.0
cudnn 7.6.5
torch 1.1.0
torchvision 0.3.0
How can I fix this?
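That ImportError means the dynamic loader cannot find the CUDA 10.0 runtime, which suggests a component was built against CUDA 10.0 while CUDA 9.0 is installed. A quick way to probe which CUDA runtime, if any, the loader can see (the result depends entirely on the machine):

```python
import ctypes.util

# Returns something like 'libcudart.so.10.2' when a CUDA runtime is
# on the loader path, or None when none is visible (as in the
# "cannot open shared object file" error above).
found = ctypes.util.find_library("cudart")
print(found)
```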
Could you try any version of torch >= 1.3.1 to see if the issue is still there?
Adding --sp is OK.
Commenting out torch.multiprocessing.set_sharing_strategy('file_system'), as suggested above, is what I had to do to make the code work on Linux too. Any idea why that is?
Hi, can you try modifying line 26 of 'demo.py' as below? torch.multiprocessing.set_start_method('spawn', force=True)
Thanks, that works for me on Linux!
Running the code doesn't work. I get the following error:
However, when I add the --sp flag, it works fine.