DirtyHarryLYL / HAKE-Action-Torch

HAKE-Action in PyTorch
Apache License 2.0

Get error while running demo.py #62

Open xsvv1 opened 2 years ago

xsvv1 commented 2 years ago

Thanks for your great work! When I run the demo, I get the following error:

[06/26 15:24:06][INFO] Activity2Vec: cfg:

[06/26 15:24:06][INFO] Activity2Vec: BENCHMARK:
  SHOW_ACTION_RES: False
DATA:
  ANNO_DB_PATH: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data/Trainval_HAKE
  DATA_DIR: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data
  FULL_SET_NAMES: ['hico-train', 'hico-test', 'hcvrd', 'openimage', 'vcoco', 'pic', 'long_tail_1', 'long_tail_2', 'collect']
  IMAGE_FOLDER_LIST: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data/metadata/data_path.json
  NUM_PARTS: 10
  NUM_PASTAS:
    ARM: 8
    FOOT: 16
    HAND: 34
    HEAD: 14
    HIP: 6
    LEG: 15
  NUM_VERBS: 157
  PASTA_LANGUAGE_MATRIX_PATH: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data/metadata/pasta_language_matrix.npy
  PASTA_NAMES: ['foot', 'leg', 'hip', 'hand', 'arm', 'head']
  PASTA_NAME_LIST: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data/metadata/Part_State_93_new.txt
  PASTA_WEIGHTS_PATH: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data/metadata/loss_weights.npy
  PRED_DB_PATH: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data/Test_pred_rcnn
  SKELETON_SIZE: 64
  TEST_GT_PASTA_PATH: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data/metadata/gt_pasta_data.pkl
  TEST_GT_VERB_PATH: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data/metadata/gt_verb_data.pkl
  VERB_NAME_LIST: /home/Xu/HAKE-Action-Torch-Activity2Vec/Data/metadata/verb_list_new.txt
DEBUG: False
DEMO:
  A2V_CFG: /home/Xu/HAKE-Action-Torch-Activity2Vec/configs/a2v/a2v.yaml
  A2V_WEIGHT: /home/Xu/HAKE-Action-Torch-Activity2Vec/checkpoints/a2v/pretrained_model.pth
  DETECTOR: yolo
  EXCLUDED_VERBS: [128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 57, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127]
  FONT_PATH: /home/Xu/HAKE-Action-Torch-Activity2Vec/tools/inference_tools/consola.ttf
  FONT_SIZE: 18
  MAX_HUMAN_NUM: 5
  POSE_CFG: /home/Xu/HAKE-Action-Torch-Activity2Vec/configs/pose/256x192_res50_lr1e-3_1x.yaml
  POSE_WEIGHT: /home/Xu/HAKE-Action-Torch-Activity2Vec/checkpoints/pose/fast_res50_256x192.pth
  SCORE_THRES: 1.5
  TRACKER_WEIGHT: /home/Xu/HAKE-Action-Torch-Activity2Vec/checkpoints/yolo/osnet.pth
  YOLO_CFG: /home/Xu/HAKE-Action-Torch-Activity2Vec/configs/yolo/yolov3-spp.cfg
  YOLO_WEIGHT: /home/Xu/HAKE-Action-Torch-Activity2Vec/checkpoints/yolo/yolov3-spp.weights
GPU_ID: 0
LOG_DIR: /home/Xu/HAKE-Action-Torch-Activity2Vec/logs
MODEL:
  DROPOUT: 0.5
  MODULE_TRAINED: ['verb']
  NUM_FC: 512
  PART_AGG_RULE: [[0, 3], [1, 2], [4], [6, 9], [7, 8], [5]]
  PART_ROI_ENABLE: True
  POSE_MAP: True
  SKELETON_DIM: 2704
  VERB_ONE_MORE_FC: False
MODEL_NAME: default
PIXEL_MEANS: [[[102.9801, 115.9465, 122.7717]]]
POOLING_SIZE: 7
RNG_SEED: 3
ROOT_DIR: /home/Xu/HAKE-Action-Torch-Activity2Vec
TEST:
  HUMAN_SCORE_ENHANCE: True
  NUM_WORKERS: 1
  OUTPUT_DIR:
  WEIGHT_PATH:
TRAIN:
  BASE_LR: 0.0025
  CHECKPOINT_INTERVAL: 50000
  CHECKPOINT_PATH:
  COMBINE_PASTA: False
  DATA_SPLITS: ['hico-train']
  DISPLAY_INTERVAL: 10
  FREEZE_BACKBONE: True
  FREEZE_RES4: True
  HUMAN_PER_IM: 10
  IM_BATCH_SIZE: 1
  LOAD_HISTORY: False
  LOSS_TYPE: bce
  LOSS_WEIGHT_K: 2
  LR_SCHEDULE: cosine
  MAX_EPOCH: 100
  MOMENTUM: 0.9
  NUM_WORKERS: 1
  POS_RATIO: 0.1
  SHOW_INTERVAL: 1000
  SHOW_LOSS_CURVE: True
  WITH_LOSS_WTS: True
WEIGHT_DIR: /home/Xu/HAKE-Action-Torch-Activity2Vec/Weights
[06/26 15:24:07][INFO] Activity2Vec: Loading AlphaPose model from /home/Xu/HAKE-Action-Torch-Activity2Vec/checkpoints/pose/fast_res50_256x192.pth...
[06/26 15:33:44][INFO] Activity2Vec: Loading Activity2Vec model from /home/Xu/HAKE-Action-Torch-Activity2Vec/checkpoints/a2v/pretrained_model.pth...
[06/26 15:33:44][INFO] Activity2Vec: [Input] Image directory detected.
0%| | 0/9658 [00:00<?, ?it/s]
THCudaCheck FAIL file=/pytorch/aten/src/THCUNN/generic/LeakyReLU.cu line=29 error=209 : no kernel image is available for execution on the device
0%| | 0/9658 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "tools/demo.py", line 211, in <module>
    image_list = read_input(args)
  File "tools/demo.py", line 45, in inference
    pose = self.alphapose.process(image_path, alpha_image)
  File "/home/Xu/HAKE-Action-Torch-Activity2Vec/tools/inference_tools/pose_inference.py", line 241, in process
    pose = None
  File "/home/Xu/HAKE-Action-Torch-Activity2Vec/tools/inference_tools/pose_inference.py", line 72, in process
    self.image_detection()
  File "/home/Xu/HAKE-Action-Torch-Activity2Vec/tools/inference_tools/pose_inference.py", line 103, in image_detection
    dets = self.detector.images_detection(imgs, im_dim_list)
  File "/home/Xu/HAKE-Action-Torch-Activity2Vec/tools/inference_tools/detector/yolo_api.py", line 92, in images_detection
    prediction = self.model(imgs)
  File "/home/a607-promax/anaconda3/envs/activity2vec/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/Xu/HAKE-Action-Torch-Activity2Vec/tools/inference_tools/detector/yolo/darknet.py", line 332, in forward
    x = self.module_list[i](x)
  File "/home/a607-promax/anaconda3/envs/activity2vec/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/a607-promax/anaconda3/envs/activity2vec/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/a607-promax/anaconda3/envs/activity2vec/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/a607-promax/anaconda3/envs/activity2vec/lib/python3.7/site-packages/torch/nn/modules/activation.py", line 559, in forward
    return F.leaky_relu(input, self.negative_slope, self.inplace)
  File "/home/a607-promax/anaconda3/envs/activity2vec/lib/python3.7/site-packages/torch/nn/functional.py", line 1061, in leaky_relu
    result = torch._C._nn.leaky_relu(input, negative_slope)
RuntimeError: cuda runtime error (209) : no kernel image is available for execution on the device at /pytorch/aten/src/THCUNN/generic/LeakyReLU.cu:29

Here is my env:

PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 20.04.4 LTS
GCC version: (Ubuntu 7.5.0-6ubuntu2) 7.5.0
CMake version: Could not collect

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090
Nvidia driver version: 510.73.05
cuDNN version: Probably one of the following:
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_train.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_train.so.8
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn.so.8.4.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8.4.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_adv_train.so.8.4.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8.4.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8.4.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8.4.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_ops_train.so.8.4.0

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.4.0
[pip3] torchfile==0.1.0
[pip3] torchvision==0.5.0
[conda] torch 1.4.0 pypi_0 pypi
[conda] torchfile 0.1.0 pypi_0 pypi
[conda] torchvision 0.5.0 pypi_0 pypi

This error seems to be related to CUDA or cuDNN, and I have not been able to fix it.

gitDaxian commented 1 year ago

Hello, I have encountered the same problem as you. Have you solved it?

DirtyHarryLYL commented 1 year ago

It seems like a torch install problem, please check: https://github.com/pytorch/pytorch/issues/31285

gitDaxian commented 1 year ago

> It seems like a torch install problem, please check: pytorch/pytorch#31285

Thank you so much for your reply! I tried to upgrade PyTorch, but encountered the following error:

activity2vec 1.0.0 requires torch==1.4.0, but you have torch 1.13.1 which is incompatible.
activity2vec 1.0.0 requires torchvision==0.5.0, but you have torchvision 0.14.1 which is incompatible.

I then tried to rebuild AlphaPose and activity2vec by running "cd AlphaPose && python setup.py build develop && cd ..", but that produced other errors; it seems to need torch==1.4.0 and torchvision==0.5.0. It feels like an endless loop...

Dxymiemiemie commented 3 months ago

I ran into the same problem. Have you figured it out? Please give me some advice. Thanks a lot!

Yangless commented 1 month ago

(1) You can compile PyTorch from source so that it targets your GPU's compute capability. (2) Or run the code on a device with a compute capability of 7.5 or below. It is recommended to use gcc-7.x for the CUDA compilation of the Activity2Vec module. I hope this helps.
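
For anyone who wants to confirm the mismatch before rebuilding, here is a minimal sketch of a compatibility check. The helper `is_supported` is a hypothetical name, not part of PyTorch or this repo. It assumes the usual CUDA rules: an `sm_XY` binary runs on GPUs with the same major version and an equal or higher minor version, while `compute_XY` PTX can be JIT-compiled for any equal or newer GPU. The RTX 3090 reports compute capability 8.6. `torch.cuda.get_arch_list()` only exists in newer PyTorch releases, so the script guards for it and does nothing on builds (such as 1.4.0) that lack it.

```python
import re

# Matches entries such as "sm_75", "compute_80" or "sm_90a".
_ARCH_RE = re.compile(r"^(sm|compute)_(\d+)(\d)[a-z]?$")


def is_supported(capability, arch_list):
    """Return True if a build compiled for arch_list can run on a GPU
    with the given (major, minor) compute capability.

    Hypothetical helper for illustration; not a PyTorch API.
    """
    major, minor = capability
    for arch in arch_list:
        match = _ARCH_RE.match(arch)
        if match is None:
            continue  # ignore entries we do not understand
        kind = match.group(1)
        arch_major, arch_minor = int(match.group(2)), int(match.group(3))
        if kind == "sm":
            # Binary code: same major version, device minor >= built minor.
            if arch_major == major and arch_minor <= minor:
                return True
        elif (arch_major, arch_minor) <= (major, minor):
            # PTX: can be JIT-compiled for any equal or newer device.
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    # get_arch_list() is missing from old builds, hence the hasattr guard.
    if (torch is not None and torch.cuda.is_available()
            and hasattr(torch.cuda, "get_arch_list")):
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("device capability:", cap)
        print("architectures in this build:", archs)
        print("supported:", is_supported(cap, archs))
```

With an illustrative arch list that stops at `sm_75` (an assumption, not taken from the log above), `is_supported((8, 6), [...])` returns False, which is consistent with the "no kernel image is available" error. If you go with option (1), the `TORCH_CUDA_ARCH_LIST` environment variable is what selects the target architectures for a PyTorch source build.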