mindee / doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
https://mindee.github.io/doctr/
Apache License 2.0

GPU inference error using db_resnet50_rotation #1020

Closed: harindercnvrg closed this 2 years ago

harindercnvrg commented 2 years ago

Bug description

Hello, I am getting the following error when running the code snippet below. I don't get the same error if I use db_resnet50 instead of db_resnet50_rotation. Help please! I am using version 0.5.1.

Code snippet to reproduce the bug

import time

from doctr.io import DocumentFile
from doctr.models import ocr_predictor
tic = time.time()
doc = DocumentFile.from_pdf("/content/file.pdf")
print(f"Number of pages: {len(doc)}")
predictor = ocr_predictor('db_resnet50_rotation','crnn_vgg16_bn',pretrained=True,assume_straight_pages=False).cuda()
result = predictor(doc)

Error traceback

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

Environment

!pip install python-doctr[torch]
!sudo apt-get install fonts-freefont-ttf -y
!pip3 install --no-build-isolation -U pypdfium2==1.0.0

Deep Learning backend

is_tf_available: False
is_torch_available: True

frgfm commented 2 years ago

Hello @harindercnvrg :wave:

Let's try to get you some help!

First, could you elaborate on "I don't get the same error if I use db_resnet50 instead of db_resnet50_rotation". What error do you get if you use that one?

What is going on here is that your model is on GPU and your input documents (and images) are not. TensorFlow moves all tensors to GPU whenever it's available by default (whether you ask for it or not), but PyTorch doesn't.
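As a rough illustration of that point (a minimal sketch, not docTR code): the small `nn.Conv2d` below is a hypothetical stand-in for a model backbone, and the tensor shapes are arbitrary. It shows that a freshly created tensor lives on CPU and has to be moved explicitly to wherever the model's weights are before the forward pass.

```python
import torch
from torch import nn

# Hypothetical stand-in for a detection or classification backbone (not a docTR model).
model = nn.Conv2d(3, 8, kernel_size=3, padding=1)

# Use the GPU when one is present; otherwise stay on CPU so the sketch still runs.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = model.to(device).eval()

# Tensors are created on CPU by default; PyTorch does not move them automatically.
batch = torch.rand(2, 3, 32, 32)

# Look up where the weights live and move the input there before calling the model.
model_device = next(model.parameters()).device
with torch.no_grad():
    out = model(batch.to(device=model_device))

print(out.shape, out.device)
```

On a CUDA machine, calling `model(batch)` without the `.to(...)` step is the situation that raises the "Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor)" error reported above.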

However, we have changed this to align the behaviour between PT & TF (cf. https://github.com/mindee/doctr/blob/main/doctr/models/detection/predictor/pytorch.py#L47-L51), so I'm surprised that you're getting this error. Could you share the full error traceback please? :pray: (perhaps it's not the detection part that causes this error :thinking:)

Cheers :v:

harindercnvrg commented 2 years ago

> First, could you elaborate on "I don't get the same error if I use db_resnet50 instead of db_resnet50_rotation". What error do you get if you use that one?

Yes, please look below: I do not get an error when I run the following code:

import os
import time
import torch
# Let's pick the desired backend
# os.environ['USE_TF'] = '1'
os.environ['USE_TORCH'] = '1'
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

from doctr.io import DocumentFile
from doctr.models import ocr_predictor
tic = time.time()
doc = DocumentFile.from_pdf("/content/file.pdf")
print(f"Number of pages: {len(doc)}")
predictor = ocr_predictor('db_resnet50','crnn_vgg16_bn',pretrained=True,assume_straight_pages=False).cuda()
result = predictor(doc)

However, I get an error when I run the following code:

import os
import time
import torch
# Let's pick the desired backend
# os.environ['USE_TF'] = '1'
os.environ['USE_TORCH'] = '1'
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

from doctr.io import DocumentFile
from doctr.models import ocr_predictor
tic = time.time()
doc = DocumentFile.from_pdf("/content/file.pdf")
print(f"Number of pages: {len(doc)}")
predictor = ocr_predictor('db_resnet50_rotation','crnn_vgg16_bn',pretrained=True,assume_straight_pages=False).cuda()
result = predictor(doc)

The difference is the detection model.

Error traceback

RuntimeError                              Traceback (most recent call last)
[<ipython-input-13-1971e6535e74>](https://localhost:8080/#) in <module>
     17 print(f"Number of pages: {len(doc)}")
     18 predictor = ocr_predictor('db_resnet50_rotation','crnn_vgg16_bn',pretrained=True,assume_straight_pages=False).cuda()
---> 19 result = predictor(doc)
     20 toc = time.time()
     21 print("time taken: ", toc - tic)

18 frames
[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1128         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used
   1132         full_backward_hooks, non_full_backward_hooks = [], []

[/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py](https://localhost:8080/#) in decorate_context(*args, **kwargs)
     25         def decorate_context(*args, **kwargs):
     26             with self.clone():
---> 27                 return func(*args, **kwargs)
     28         return cast(F, decorate_context)
     29 

[/usr/local/lib/python3.7/dist-packages/doctr/models/predictor/pytorch.py](https://localhost:8080/#) in forward(self, pages, **kwargs)
     85         # Rectify crop orientation
     86         if not self.assume_straight_pages:
---> 87             crops, loc_preds = self._rectify_crops(crops, loc_preds)
     88         # Identify character sequences
     89         word_preds = self.reco_predictor([crop for page_crops in crops for crop in page_crops], **kwargs)

[/usr/local/lib/python3.7/dist-packages/doctr/models/predictor/base.py](https://localhost:8080/#) in _rectify_crops(self, crops, loc_preds)
     90     ) -> Tuple[List[List[np.ndarray]], List[np.ndarray]]:
     91         # Work at a page level
---> 92         orientations = [self.crop_orientation_predictor(page_crops) for page_crops in crops]  # type: ignore[misc]
     93         rect_crops = [rectify_crops(page_crops, orientation) for page_crops, orientation in zip(crops, orientations)]
     94         rect_loc_preds = [

[/usr/local/lib/python3.7/dist-packages/doctr/models/predictor/base.py](https://localhost:8080/#) in <listcomp>(.0)
     90     ) -> Tuple[List[List[np.ndarray]], List[np.ndarray]]:
     91         # Work at a page level
---> 92         orientations = [self.crop_orientation_predictor(page_crops) for page_crops in crops]  # type: ignore[misc]
     93         rect_crops = [rectify_crops(page_crops, orientation) for page_crops, orientation in zip(crops, orientations)]
     94         rect_loc_preds = [

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1128         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used
   1132         full_backward_hooks, non_full_backward_hooks = [], []

[/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py](https://localhost:8080/#) in decorate_context(*args, **kwargs)
     25         def decorate_context(*args, **kwargs):
     26             with self.clone():
---> 27                 return func(*args, **kwargs)
     28         return cast(F, decorate_context)
     29 

[/usr/local/lib/python3.7/dist-packages/doctr/models/classification/predictor/pytorch.py](https://localhost:8080/#) in forward(self, crops)
     47         predicted_batches = [
     48             self.model(batch)
---> 49             for batch in processed_batches
     50         ]
     51 

[/usr/local/lib/python3.7/dist-packages/doctr/models/classification/predictor/pytorch.py](https://localhost:8080/#) in <listcomp>(.0)
     47         predicted_batches = [
     48             self.model(batch)
---> 49             for batch in processed_batches
     50         ]
     51 

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1128         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used
   1132         full_backward_hooks, non_full_backward_hooks = [], []

[/usr/local/lib/python3.7/dist-packages/torchvision/models/mobilenetv3.py](https://localhost:8080/#) in forward(self, x)
    234 
    235     def forward(self, x: Tensor) -> Tensor:
--> 236         return self._forward_impl(x)
    237 
    238 

[/usr/local/lib/python3.7/dist-packages/torchvision/models/mobilenetv3.py](https://localhost:8080/#) in _forward_impl(self, x)
    224 
    225     def _forward_impl(self, x: Tensor) -> Tensor:
--> 226         x = self.features(x)
    227 
    228         x = self.avgpool(x)

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1128         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used
   1132         full_backward_hooks, non_full_backward_hooks = [], []

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py](https://localhost:8080/#) in forward(self, input)
    137     def forward(self, input):
    138         for module in self:
--> 139             input = module(input)
    140         return input
    141 

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1128         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used
   1132         full_backward_hooks, non_full_backward_hooks = [], []

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py](https://localhost:8080/#) in forward(self, input)
    137     def forward(self, input):
    138         for module in self:
--> 139             input = module(input)
    140         return input
    141 

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1128         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used
   1132         full_backward_hooks, non_full_backward_hooks = [], []

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py](https://localhost:8080/#) in forward(self, input)
    455 
    456     def forward(self, input: Tensor) -> Tensor:
--> 457         return self._conv_forward(input, self.weight, self.bias)
    458 
    459 class Conv3d(_ConvNd):

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py](https://localhost:8080/#) in _conv_forward(self, input, weight, bias)
    452                             _pair(0), self.dilation, self.groups)
    453         return F.conv2d(input, weight, bias, self.stride,
--> 454                         self.padding, self.dilation, self.groups)
    455 
    456     def forward(self, input: Tensor) -> Tensor:

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

frgfm commented 2 years ago

Thanks for that! The traceback gave the source of the problem away: in the crop orientation predictor, the input tensor isn't being moved to the device of the model as it should be.

I'll fix this shortly then :) I'll let you know here when it's merged!

felixdittrich92 commented 2 years ago

Hi @harindercnvrg, this should work now (on main) 👋

minouei-kl commented 1 year ago

Hi @felixdittrich92, I'm getting a similar error using db_resnet50_rotation with version 0.7.1a0. It works fine with db_resnet50.

Error traceback

Traceback (most recent call last):
  File "ocr.py", line 42, in <module>
    result = model(image)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/minouei/Documents/models/rvl/gits/doctr/doctr/models/predictor/pytorch.py", line 117, in forward
    crops, loc_preds = self._rectify_crops(crops, loc_preds)
  File "/home/minouei/Documents/models/rvl/gits/doctr/doctr/models/predictor/base.py", line 93, in _rectify_crops
    orientations = [self.crop_orientation_predictor(page_crops) for page_crops in crops]  # type: ignore[misc]
  File "/home/minouei/Documents/models/rvl/gits/doctr/doctr/models/predictor/base.py", line 93, in <listcomp>
    orientations = [self.crop_orientation_predictor(page_crops) for page_crops in crops]  # type: ignore[misc]
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/minouei/Documents/models/rvl/gits/doctr/doctr/models/classification/predictor/pytorch.py", line 48, in forward
    predicted_batches = [self.model(batch).to(device=_device) for batch in processed_batches]
  File "/home/minouei/Documents/models/rvl/gits/doctr/doctr/models/classification/predictor/pytorch.py", line 48, in <listcomp>
    predicted_batches = [self.model(batch).to(device=_device) for batch in processed_batches]
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torchvision/models/mobilenetv3.py", line 179, in forward
    return self._forward_impl(x)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torchvision/models/mobilenetv3.py", line 169, in _forward_impl
    x = self.features(x)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 446, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/minouei/miniconda3/envs/mmocr3/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

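I had to change this line in doctr/models/classification/predictor/pytorch.py so that each batch is moved to the model's device before the forward pass:
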
        _device = next(self.model.parameters()).device
        predicted_batches = [self.model(batch.to(device=_device)).to(device=_device) for batch in processed_batches]
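The same idea can be sketched outside docTR as a standalone helper. This is only an illustration under assumptions: `predict_batches` and `toy_model` are hypothetical names, and the toy network merely stands in for the crop orientation classifier. It is not the library's implementation.

```python
from typing import List

import torch
from torch import nn


def predict_batches(model: nn.Module, batches: List[torch.Tensor]) -> List[torch.Tensor]:
    """Run each batch on whatever device the model's parameters live on."""
    _device = next(model.parameters()).device
    with torch.no_grad():
        # Move every batch to the model's device before the forward pass.
        return [model(batch.to(device=_device)) for batch in batches]


# Hypothetical toy classifier with 4 output classes (not the docTR model).
toy_model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 4),
).eval()
if torch.cuda.is_available():
    toy_model = toy_model.cuda()

# The batches are created on CPU, as the preprocessed crops were in the traceback.
cpu_batches = [torch.rand(2, 3, 16, 16), torch.rand(1, 3, 16, 16)]
outputs = predict_batches(toy_model, cpu_batches)
print([tuple(o.shape) for o in outputs])
```

Reading the device from the parameters, rather than hard-coding `"cuda"`, keeps the helper working whether or not `.cuda()` was called on the model.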