georgeretsi / HTR-best-practices

Basic HTR concepts/modules to boost performance

operands could not be broadcast together with remapped #1

Closed MuhammadSaim7776 closed 1 year ago

MuhammadSaim7776 commented 1 year ago

I am using this code for my own dataset. While training, I am getting this error:

ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,2) and requested shape (3,2)

Can you help me sort out this error?

georgeretsi commented 1 year ago

Can you please provide a more detailed description of the error? In which line of the code did it appear?

MuhammadSaim7776 commented 1 year ago

Traceback (most recent call last):
  File "/home/cle-dl-10/Desktop/SAIM/latest ocr/HTR-best-practices-main/train_htr.py", line 312, in <module>
    train(epoch)
  File "/home/cle-dl-10/Desktop/SAIM/latest ocr/HTR-best-practices-main/train_htr.py", line 135, in train
    for iter_idx, (img, transcr) in enumerate(train_loader):
  File "/home/cle-dl-10/anaconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/home/cle-dl-10/anaconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 671, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/cle-dl-10/anaconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/cle-dl-10/anaconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/cle-dl-10/Desktop/SAIM/latest ocr/HTR-best-practices-main/utils/word_dataset.py", line 90, in __getitem__
    img = centered(img, (fheight, fwidth), border_value=0.0)
  File "/home/cle-dl-10/Desktop/SAIM/latest ocr/HTR-best-practices-main/utils/auxilary_functions.py", line 60, in centered
    word_img = np.pad(word_img[ys:ye, xs:xe], (padh, padw), 'constant', constant_values=border_value)
  File "<__array_function__ internals>", line 180, in pad
  File "/home/cle-dl-10/anaconda3/envs/pytorch/lib/python3.10/site-packages/numpy/lib/arraypad.py", line 743, in pad
    pad_width = _as_pairs(pad_width, array.ndim, as_index=True)
  File "/home/cle-dl-10/anaconda3/envs/pytorch/lib/python3.10/site-packages/numpy/lib/arraypad.py", line 518, in _as_pairs
    return np.broadcast_to(x, (ndim, 2)).tolist()
  File "<__array_function__ internals>", line 180, in broadcast_to
  File "/home/cle-dl-10/anaconda3/envs/pytorch/lib/python3.10/site-packages/numpy/lib/stride_tricks.py", line 413, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
  File "/home/cle-dl-10/anaconda3/envs/pytorch/lib/python3.10/site-packages/numpy/lib/stride_tricks.py", line 349, in _broadcast_to
    it = np.nditer(
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,2) and requested shape (3,2)

georgeretsi commented 1 year ago

From the function that raises the error (np.pad), you can safely assume that it is a dimension mismatch. Most probably you used a color image as a 3D tensor input, while I used grayscale images as 2D tensors.
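A minimal sketch of how this kind of mismatch can arise, assuming the pad widths are passed as two (before, after) pairs as in the np.pad call in the traceback. The helper name pad_2d and the array sizes are made up for illustration and are not taken from the repository.

```python
import numpy as np

def pad_2d(img, padh, padw, border_value=0.0):
    # Hypothetical helper: same call pattern as the np.pad line in the traceback.
    # Two (before, after) pairs only describe padding for a 2-axis array.
    return np.pad(img, (padh, padw), "constant", constant_values=border_value)

gray = np.zeros((4, 6))       # grayscale image: 2 axes (height, width)
color = np.zeros((4, 6, 3))   # color image: 3 axes (height, width, channels)

# Works: one pad pair per axis of the 2D array.
padded = pad_2d(gray, (1, 1), (2, 2))
print(padded.shape)  # (6, 10)

# Fails: NumPy tries to broadcast the (2, 2) pad spec to (3, 2), one pair per axis.
raised = False
try:
    pad_2d(color, (1, 1), (2, 2))
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

If the second call fails like this on your data, the loaded image most likely has a channel axis that the padding code does not expect.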

MuhammadSaim7776 commented 1 year ago

I used OpenCV to read the image and then converted it to grayscale. It worked for me. Thanks.

georgeretsi commented 1 year ago

Even if that's the case, the image can still be represented as a 3D tensor (3 axes instead of 2). Check the dimensions of the loaded image and reshape it accordingly.
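One possible way to do that check, sketched with NumPy only. The function name to_2d_gray is hypothetical, and the plain channel average is an assumption made for brevity, not the luminance weighting an image library would normally apply.

```python
import numpy as np

def to_2d_gray(img):
    """Return a 2-axis (height, width) array, whatever the loader produced."""
    img = np.asarray(img)
    if img.ndim == 2:
        # Already (height, width): nothing to do.
        return img
    if img.ndim == 3 and img.shape[2] == 1:
        # Grayscale stored with a trailing channel axis: drop that axis.
        return img[:, :, 0]
    if img.ndim == 3 and img.shape[2] in (3, 4):
        # Color (optionally with alpha): average the first three channels.
        # Assumption: an unweighted mean is good enough for this sketch.
        return img[:, :, :3].mean(axis=2)
    raise ValueError(f"unexpected image shape {img.shape}")

# Inspect the shape before and after, as suggested above.
loaded = np.ones((32, 128, 3))
print(loaded.shape, "->", to_2d_gray(loaded).shape)
```

Printing img.shape right after loading is usually enough to tell which of these cases applies.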

MuhammadSaim7776 commented 1 year ago

Can we use this code for Urdu language recognition? If so, kindly tell me the changes that I have to make.



georgeretsi commented 1 year ago

In theory, you can use it for Urdu. I am not familiar with the peculiarities of such scripts. For example, if it is a right-to-left language, you should reverse both the final prediction sequence and the target character sequence. I cannot help you with low-level language-specific adjustments.
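A rough sketch of one reading of that suggestion: reverse the target transcription before training so that it follows the left-to-right order in which the image columns are scanned, then reverse the decoded prediction to get back to normal text order. The function names are hypothetical and do not exist in the repository, and where such calls would go in the data loading and decoding code is left open.

```python
def to_visual_order(text):
    # Reverse the stored transcription so its first character corresponds
    # to the leftmost part of the word image.
    # Caveat: this is a plain code-point reversal; combining marks end up
    # before their base character in the reversed string.
    return text[::-1]

def to_logical_order(decoded):
    # Undo the reversal on the decoded prediction to recover reading order.
    return decoded[::-1]

target = "\u0627\u0631\u062f\u0648"      # the word "Urdu" written in Urdu script
train_label = to_visual_order(target)    # what the model would be trained on
restored = to_logical_order(train_label) # what you would report after decoding
print(restored == target)
```

Because the same reversal is applied in both directions, labels and predictions stay consistent with each other; script-specific issues such as contextual letter shapes are outside the scope of this sketch.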