roatienza / deep-text-recognition-benchmark

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
Apache License 2.0

Rand Aug #31

Open fmobrj opened 1 year ago

fmobrj commented 1 year ago

Hello @roatienza!

Thanks for this great repo!

I am trying to train with rand_aug, but I am running into an error in blur.py when it calls `cv2.cvtColor` to convert the image from BGR to RGB. It seems the image has just one channel.

```
error: Caught error in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/fmobrj/anaconda3/envs/vitstr/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/fmobrj/anaconda3/envs/vitstr/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/media/hdd6tb/jupyter/notebooks/vitstr/deep-text-recognition-benchmark/dataset.py", line 500, in __call__
    image_tensors = [transform(image) for image in images]
  File "/media/hdd6tb/jupyter/notebooks/vitstr/deep-text-recognition-benchmark/dataset.py", line 500, in <listcomp>
    image_tensors = [transform(image) for image in images]
  File "/media/hdd6tb/jupyter/notebooks/vitstr/deep-text-recognition-benchmark/dataset.py", line 336, in __call__
    img = self.rand_aug(img)
  File "/media/hdd6tb/jupyter/notebooks/vitstr/deep-text-recognition-benchmark/dataset.py", line 357, in rand_aug
    img = op(img, mag=mag)
  File "/media/hdd6tb/jupyter/notebooks/vitstr/deep-text-recognition-benchmark/augmentation/blur.py", line 104, in __call__
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2.error: OpenCV(3.4.18) /io/opencv/modules/imgproc/src/color.simd_helpers.hpp:88: error: (-2:Unspecified error) in function 'cv::impl::{anonymous}::CvtHelper<VScn, VDcn, VDepth, sizePolicy>::CvtHelper(cv::InputArray, cv::OutputArray, int) [with VScn = cv::impl::{anonymous}::Set<3, 4>; VDcn = cv::impl::{anonymous}::Set<3, 4>; VDepth = cv::impl::{anonymous}::Set<0, 2, 5>; cv::impl::{anonymous}::SizePolicy sizePolicy = cv::impl::<unnamed>::NONE; cv::InputArray = const cv::_InputArray&; cv::OutputArray = const cv::_OutputArray&]'
> Invalid number of channels in input image:
>     'VScn::contains(scn)'
> where
>     'scn' is 1
```

roatienza commented 1 year ago

Please try it with a color image and see whether the error persists. AFAIK, the function was tested on color images only.
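
For anyone hitting the same `'scn' is 1` error, here is a minimal sketch of one possible workaround: expand a single-channel image to three channels before it reaches the augmentation ops. It assumes the image is available as a NumPy array at that point. `ensure_three_channels` is a hypothetical helper written for illustration and is not part of this repo.

```python
import numpy as np


def ensure_three_channels(img: np.ndarray) -> np.ndarray:
    """Return an H x W x 3 array, replicating the gray channel if needed.

    Hypothetical helper for illustration; not part of the repository.
    """
    if img.ndim == 2:
        # H x W grayscale: stack the single plane three times.
        return np.stack([img, img, img], axis=-1)
    if img.ndim == 3 and img.shape[2] == 1:
        # H x W x 1 grayscale: repeat along the channel axis.
        return np.repeat(img, 3, axis=2)
    if img.ndim == 3 and img.shape[2] == 3:
        # Already three channels: return unchanged.
        return img
    raise ValueError(f"unsupported image shape: {img.shape}")


# Example: a blank 32 x 100 grayscale image becomes 32 x 100 x 3.
gray = np.zeros((32, 100), dtype=np.uint8)
rgb = ensure_three_channels(gray)
print(rgb.shape)
```

If the image is still a PIL image when the augmentation runs, calling `img.convert("RGB")` on it would be the analogous step. Whether either approach fits the training pipeline here is an assumption and would need checking against the actual data loading options.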