potterhsu / easy-faster-rcnn.pytorch

An easy implementation of Faster R-CNN (https://arxiv.org/pdf/1506.01497.pdf) in PyTorch.
MIT License

CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)` #33

Open simonefelicioni opened 3 years ago

simonefelicioni commented 3 years ago

Hi,

I'm using the model pretrained on COCO 2017 with a ResNet-101 backbone for inference, but I sometimes run into a strange CUDA error:

Traceback (most recent call last):
  File "C:\Users\Simone\anaconda3\envs\phd\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\Simone\PycharmProjects\Smart_Mapping\online_operations\main_thread.py", line 29, in run
    out_obj_det_img, bboxes, obj_classes, obj_prob = obj_det(img_tensor=img, steps=self.tot_path_steps,
  File "c:\users\simone\pycharmprojects\smart_mapping\faster-rcnn\infer.py", line 65, in obj_det
    model.eval().forward(image_tensor.unsqueeze(dim=0).cuda())
  File "c:\users\simone\pycharmprojects\smart_mapping\faster-rcnn\model.py", line 62, in forward
    proposal_classes, proposal_transformers = self.detection.forward(features, proposal_bboxes)
  File "c:\users\simone\pycharmprojects\smart_mapping\faster-rcnn\model.py", line 111, in forward
    proposal_classes = self._proposal_class(hidden)
  File "C:\Users\Simone\anaconda3\envs\phd\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\Simone\anaconda3\envs\phd\lib\site-packages\torch\nn\modules\linear.py", line 93, in forward
    return F.linear(input, self.weight, self.bias)
  File "C:\Users\Simone\anaconda3\envs\phd\lib\site-packages\torch\nn\functional.py", line 1690, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`

It's very strange because the error shows up only sometimes and then disappears at random. It seems to be related to the number of classes (as I saw here), but I don't know how to solve it.

Have you ever come across this? Do you know whether the problem lies in my CUDA setup or in the model?

Thanks.
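
For reference, cuBLAS reports CUBLAS_STATUS_ALLOC_FAILED when it cannot allocate a resource it needs, which is usually GPU memory, and the traceback above shows the forward pass being started from a `threading` thread. One thing that could be tried is to route every inference call through a single long-lived worker thread, so that GPU work never overlaps. The sketch below shows that pattern only. It assumes, without confirmation, that the intermittent failure is tied to GPU memory pressure from threaded calls. `InferenceWorker` and `infer_fn` are hypothetical names, and `infer_fn` stands in for whatever function wraps the model call (for example `obj_det`).

```python
import queue
import threading
from concurrent.futures import Future


class InferenceWorker:
    """Run every inference call on one long-lived thread.

    `infer_fn` is a hypothetical callable that wraps the model call;
    it is an assumption of this sketch, not part of the repository.
    """

    def __init__(self, infer_fn):
        self._infer_fn = infer_fn
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Process jobs one at a time, in submission order.
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel pushed by close()
                break
            future, args, kwargs = job
            try:
                future.set_result(self._infer_fn(*args, **kwargs))
            except Exception as exc:
                # Hand the error back to the caller instead of
                # killing the worker thread.
                future.set_exception(exc)

    def submit(self, *args, **kwargs):
        """Queue one call and return a Future for its result."""
        future = Future()
        self._jobs.put((future, args, kwargs))
        return future

    def close(self):
        """Stop the worker after the jobs already queued have run."""
        self._jobs.put(None)
        self._thread.join()
```

A caller would create one `InferenceWorker(obj_det)` at startup and replace direct calls with `worker.submit(...).result()`. If the error still appears with calls serialized this way, threading is probably not the cause, and GPU memory held by other processes would be the next thing to check.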