Closed LinXin2018 closed 6 years ago
Can you provide more details?
Maybe you can try training the model with pretrained RPN --rpn /path/to/rpn
Hello!! author:
This error first occurred when I tried to evaluate the model with the pretrained model. I've added the --rpn option, but it does not work!
I also tried to train the model, but the same error occurred! Right after training started, it reported: THCudaCheck FAIL file=/pytorch/torch/lib/THC/THCTensorCopy.cu line=204 error=8 : invalid device function
and in engines_v1.py it jumps to the exception handler.
It seems that the roialign function does not work well.
Thank You!
Yes, it is because VG-DR-Net has some self-relations, e.g. A-relation-A. Please use git pull to update to the latest version. Please check the Updates in the README for more information.
If it works, please comment here and I will close the issue.
Thank you for updating the code. Unfortunately, the CUDA runtime error (8) still remains! Maybe it is because pytorch 0.3.1 + python2.7 does not fit the GTX 1080 Ti architecture (cuda8.0). I will try to run the code under pytorch 0.4.x.
Sorry, I haven't met this issue before. Maybe you can try the evaluation and training on different datasets to check whether it happens only with a specific setting or with any setting. Hope to hear more about the issue.
Hello! Today I tried to run evaluation on the DR dataset and tried to train the RPN network, and exactly the same error occurred! I think it's a CUDA environment issue rather than a bug in the code. LinXin
I am so sorry to hear that.
I think you can use pdb to track where the bug happened, so we can check whether it is caused by the settings of the code or by the general configuration of your server.
Looking forward to any updates.
Hello, I have debugged the code before; it jumps to the exception handler at engines_v1.py, line 86.
It ends at line 86 because we have an exception catch there. That is not the position where the error happened. I highly recommend that you use pdb to run the code step by step to check where it actually happens.
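For readers following along, here is a minimal sketch of how to find the real origin of an error that surfaces inside a catch-all handler. It is not code from this repository: failing_layer and run_step are hypothetical stand-ins for the call chain, and origin_of_current_exception is an illustrative helper. It only uses the standard sys and traceback modules to read the innermost frame of the exception being handled.

```python
import sys
import traceback


def failing_layer():
    # Hypothetical stand-in for the call that actually fails.
    raise RuntimeError("cuda runtime error (8) : invalid device function")


def run_step():
    # Hypothetical stand-in for the code wrapped by the try/except.
    failing_layer()


def origin_of_current_exception():
    """Return (filename, lineno, function) for the innermost frame
    of the exception currently being handled."""
    tb = sys.exc_info()[2]
    frame = traceback.extract_tb(tb)[-1]  # last entry = where it was raised
    return frame[0], frame[1], frame[2]


try:
    run_step()
except Exception:
    origin = origin_of_current_exception()
    print("error raised in %s (%s:%d)" % (origin[2], origin[0], origin[1]))
    # For an interactive session, one could instead call:
    #   import pdb; pdb.post_mortem(sys.exc_info()[2])
```

Alternatively, running the script as `python -m pdb train_FN.py` should start it under the debugger from the first line, which allows stepping through until the failing call.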
Now I will close the issue. Feel free to leave comments at this thread.
I had the same error. It might be related to a CUDA version mismatch (at least in my case) as the pytorch installed via pip isn't compiled using the latest version. Were you able to solve this?
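One way to check for such a mismatch is to ask PyTorch which CUDA version it was built with and what the GPU reports. This is a rough sketch, not part of the project: describe_cuda_setup is a hypothetical helper name, and it assumes that torch.version.cuda, torch.cuda.get_device_name and torch.cuda.get_device_capability are available in the installed PyTorch version.

```python
def describe_cuda_setup():
    """Collect basic facts about the PyTorch build and the first visible GPU."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter.
        return {"torch": None}

    info = {
        "torch": torch.__version__,
        # CUDA version the PyTorch binary was compiled with (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
        # (major, minor) compute capability of GPU 0.
        info["capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_cuda_setup().items()):
        print("%s: %s" % (key, value))
```

Comparing built_with_cuda against the locally installed toolkit (for example the output of `nvcc --version`) and the capability against the architectures the extensions were compiled for may narrow down where the mismatch is.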
Not yet.
So in my case, my graphics card has CUDA compute capability 7.0, and PyTorch 0.3.x, which I'm assuming this project requires, isn't compatible with that level. If it helps to debug your situation, I moved to a computer with a Titan X (CUDA 8.0, capability 6.1) and it worked fine using the instructions in the README.
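Some background that may help here: CUDA error 8, "invalid device function", generally means that no kernel image built for the GPU's architecture was found. The sketch below is a rough way to reason about that, assuming the CUDA extensions are compiled with nvcc -gencode flags. The flag strings are illustrative examples, not copied from this repository's lib/make.sh, and parse_gencode and can_run are hypothetical helper names. It applies two rules: a compiled binary for sm_XY runs on devices with the same major version and an equal or higher minor version, and embedded PTX for compute_XY can be JIT-compiled on any device with capability X.Y or higher.

```python
import re


def parse_gencode(flags):
    """Return (sm_targets, ptx_targets) as sets of (major, minor)
    found in the code= part of nvcc -gencode flags."""
    sm, ptx = set(), set()
    # Anchor on ",code=" so that "-gencode=" itself is not mistaken for "code=".
    for value in re.findall(r",code=(\[[^\]]*\]|[^\s\]]+)", flags):
        for kind, digits in re.findall(r"(sm|compute)_(\d+)", value):
            cap = (int(digits[:-1]), int(digits[-1]))
            (sm if kind == "sm" else ptx).add(cap)
    return sm, ptx


def can_run(device_cap, flags):
    """Estimate whether code built with `flags` can run on a device
    with compute capability `device_cap` = (major, minor)."""
    sm, ptx = parse_gencode(flags)
    major, minor = device_cap
    # Binary (cubin) compatibility: same major, device minor >= target minor.
    binary_ok = any(m == major and n <= minor for m, n in sm)
    # PTX compatibility: device capability >= PTX target.
    ptx_ok = any((m, n) <= (major, minor) for m, n in ptx)
    return binary_ok or ptx_ok


if __name__ == "__main__":
    example = "-gencode arch=compute_61,code=sm_61"  # illustrative flags only
    for cap in [(6, 1), (7, 0)]:
        print(cap, can_run(cap, example))
```

Under these assumptions, a build that only targets sm_61 would work on a capability 6.1 card but not on a 7.0 card, which matches the behaviour described above. The prebuilt PyTorch binary has its own list of supported architectures, so both the extensions and PyTorch itself need to cover the GPU.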
I met the same problem with pytorch 0.3.1 + python3.6.
Hello author: When I try to run your code, it reports:
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCTensorMath.cu line=35 error=8 : invalid device function
Traceback (most recent call last):
  File "train_FN.py", line 390, in <module>
    main()
  File "train_FN.py", line 277, in main
    use_gt_boxes=args.use_gt_boxes)
  File "/home/linxin/FactorizableNet/models/HDN_v2/engines_v1.py", line 123, in test
    use_gt_boxes=use_gt_boxes)
  File "/home/linxin/FactorizableNet/models/HDN_v2/factorizable_network_v4.py", line 271, in evaluate
    object_result, predicate_result = self.forward_eval(im_data, im_info,)
  File "/home/linxin/FactorizableNet/models/HDN_v2/factorizable_network_v4.py", line 232, in forward_eval
    pooled_object_features = self.roi_pool_object(features, object_rois).view(len(object_rois), -1)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/linxin/FactorizableNet/lib/roi_align/modules/roi_align.py", line 16, in forward
    self.spatial_scale)(features, rois)
  File "/home/linxin/FactorizableNet/lib/roi_align/functions/roi_align.py", line 22, in forward
    output = features.new(num_rois, num_channels, self.aligned_height, self.aligned_width).zero_()
RuntimeError: cuda runtime error (8) : invalid device function at /pytorch/torch/lib/THC/generic/THCTensorMath.cu:35
I have changed the lib/make.sh file since my CUDA_ARCH does not support 6.0. make.sh seems to work for me, with only a few warnings. After reading [this issue](https://github.com/jwyang/faster-rcnn.pytorch/issues/110), I have re-run make.sh a few times, but the CUDA error has not been resolved.
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c: In function ‘BilinearSamplerBHWD_updateGradInput’:
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:190:14: warning: unused variable ‘inBottomRight’ [-Wunused-variable]
         real inBottomRight=0;
              ^
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:189:14: warning: unused variable ‘inBottomLeft’ [-Wunused-variable]
         real inBottomLeft=0;
              ^
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:188:14: warning: unused variable ‘inTopRight’ [-Wunused-variable]
         real inTopRight=0;
              ^
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:187:14: warning: unused variable ‘inTopLeft’ [-Wunused-variable]
         real inTopLeft=0;
              ^
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:186:14: warning: unused variable ‘v’ [-Wunused-variable]
         real v=0;
              ^
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c: In function ‘BilinearSamplerBCHW_updateGradInput’:
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:441:14: warning: unused variable ‘inBottomRight’ [-Wunused-variable]
         real inBottomRight=0;
              ^
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:440:14: warning: unused variable ‘inBottomLeft’ [-Wunused-variable]
         real inBottomLeft=0;
              ^
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:439:14: warning: unused variable ‘inTopRight’ [-Wunused-variable]
         real inTopRight=0;
              ^
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:438:14: warning: unused variable ‘inTopLeft’ [-Wunused-variable]
         real inTopLeft=0;
              ^
/home/linxin/FactorizableNet/lib/roi_crop/src/roi_crop.c:437:14: warning: unused variable ‘v’ [-Wunused-variable]
         real v=0;
              ^
My environment is CUDA 8.0, pytorch 0.3.1, python 2.7.
Hope to receive your reply!
THX