JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
https://www.jaided.ai
Apache License 2.0

Bug in craft training in make_char_box.py #1166

Open connorourke opened 11 months ago

connorourke commented 11 months ago

I am hitting a divide-by-zero exception in make_charbox.py when training the CRAFT model. It is raised in crop_image_by_bbox(), which crops the image by a word's bounding box, when that bounding box has an area of zero.

Stack trace below:

> /home/connoourke/bin/src/EasyOCRDev/EasyOCR/trainer/craft/data/pseudo_label/make_charbox.py(36)crop_image_by_bbox()
     35 
---> 36         one_char_ratio = min(h, w) / (max(h, w) / len(word))
     37 

ipdb> w
  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/trainer/craft/train.py(479)<module>()
    477 
    478 if __name__ == "__main__":
--> 479     main()

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/trainer/craft/train.py(472)main()
    471     trainer = Trainer(config, 0, mode)
--> 472     trainer.train(buffer_dict)
    473 

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/trainer/craft/train.py(239)train()
    238         while train_step < whole_training_step:
--> 239             for (
    240                     index,

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/.venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py(630)__next__()
    629                 self._reset()  # type: ignore[call-arg]
--> 630             data = self._next_data()
    631             self._num_yielded += 1

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/.venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py(674)_next_data()
    673         index = self._next_index()  # may raise StopIteration
--> 674         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    675         if self._pin_memory:

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/.venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py(51)fetch()
     50             else:
---> 51                 data = [self.dataset[idx] for idx in possibly_batched_index]
     52         else:

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/.venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py(51)<listcomp>()
     50             else:
---> 51                 data = [self.dataset[idx] for idx in possibly_batched_index]
     52         else:

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/trainer/craft/data/dataset.py(150)__getitem__()
    149                 words,
--> 150             ) = self.make_gt_score(index)
    151         else:

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/trainer/craft/data/dataset.py(457)make_gt_score()
    456             horizontal_text_bools,
--> 457         ) = self.load_data(index)
    458         img_h, img_w, _ = image.shape

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/trainer/craft/data/dataset.py(424)load_data()
    423                 horizontal_text_bool,
--> 424             ) = self.pseudo_charbox_builder.build_char_box(
    425                 self.net, self.gpu, image, word_bboxes[i], words[i], img_name=img_name

  /home/connoourke/bin/src/EasyOCRDev/EasyOCR/trainer/craft/data/pseudo_label/make_charbox.py(209)build_char_box()
    208     def build_char_box(self, net, gpu, image, word_bbox, word, img_name=""):
--> 209         word_image, M, horizontal_text_bool = self.crop_image_by_bbox(
    210             image, word_bbox, word

> /home/connoourke/bin/src/EasyOCRDev/EasyOCR/trainer/craft/data/pseudo_label/make_charbox.py(36)crop_image_by_bbox()
     35 
---> 36         one_char_ratio = min(h, w) / (max(h, w) / len(word))
     37 

ipdb> a
self = <data.pseudo_label.make_charbox.PseudoCharBoxBuilder object at 0x7f68ded91030>
image = array([[[125, 141, 166],
        [153, 170, 196],
        [182, 201, 231],
        ...,
        [227, 247, 254],
        [227, 247, 254],
        [227, 247, 254]],

       [[140, 156, 181],
        [154, 171, 197],
        [168, 187, 217],
        ...,
        [227, 247, 254],
        [227, 247, 254],
        [227, 247, 254]],

       [[155, 171, 197],
        [157, 174, 200],
        [157, 176, 206],
        ...,
        [228, 248, 255],
        [227, 247, 254],
        [227, 247, 254]],

       ...,

       [[ 57,  71,  71],
        [ 55,  67,  65],
        [ 57,  67,  66],
        ...,
        [190, 182, 159],
        [191, 183, 160],
        [205, 197, 174]],

       [[ 56,  72,  72],
        [ 60,  76,  75],
        [ 79,  93,  93],
        ...,
        [142, 130, 108],
        [152, 140, 118],
        [197, 185, 163]],

       [[ 84, 102, 102],
        [103, 121, 121],
        [139, 155, 154],
        ...,
        [119, 104,  83],
        [119, 104,  83],
        [166, 151, 130]]], dtype=uint8)
box = array([[322., 506.],
       [322., 506.],
       [322., 506.],
       [322., 506.]], dtype=float32)
word = 'af'
ipdb> 
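
For reference, here is a minimal sketch of why the box printed above triggers the error. It assumes width and height are derived as integer side lengths of the four-point box; the helpers `side_lengths` and `one_char_ratio` are hypothetical names for illustration, not EasyOCR functions. Only the ratio expression is taken from the trace.

```python
import numpy as np


def side_lengths(box):
    """Return (w, h) of a 4-point box as integer side lengths.

    Assumption: corners are ordered top-left, top-right, bottom-right,
    bottom-left. This is an illustration, not EasyOCR's own code.
    """
    box = np.asarray(box, dtype=np.float32)
    w = max(int(np.linalg.norm(box[0] - box[1])), int(np.linalg.norm(box[2] - box[3])))
    h = max(int(np.linalg.norm(box[0] - box[3])), int(np.linalg.norm(box[1] - box[2])))
    return w, h


def one_char_ratio(box, word):
    """Guarded version of the expression on line 36 of the trace.

    Returns None instead of raising when the box or the word is empty.
    """
    w, h = side_lengths(box)
    if max(h, w) == 0 or len(word) == 0:
        return None  # degenerate box: caller should skip this sample
    return min(h, w) / (max(h, w) / len(word))


# The box from the debugger session: all four corners are the same point,
# so both side lengths are 0 and the unguarded expression divides by zero.
degenerate_box = [[322.0, 506.0]] * 4
print(side_lengths(degenerate_box))
print(one_char_ratio(degenerate_box, "af"))
```

Returning `None` is only one possible design choice; filtering such boxes out of the annotations before training would avoid the problem at its source.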

EivindKjosbakken commented 8 months ago

I encountered a similar issue. In my case the cause was the format of my bounding box coordinates, which should be (x1,y1,x2,y2,x3,y3,x4,y4), where point 1 is the top-left corner, 2 the top-right corner, 3 the bottom-right corner and 4 the bottom-left corner. Make sure your bounding box coordinates follow this format. If you are interested, I also wrote an article on fine-tuning the CRAFT model, published on Towards AI, which will hopefully be of further help: https://medium.com/towards-artificial-intelligence/how-to-fine-tune-the-craft-model-in-easyocr-f9fa0ac5cc9d
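
A rough way to check annotations against this format before training is sketched below. The function names and the `min_area` threshold are hypothetical, chosen only for illustration. The check uses the shoelace formula: in image coordinates (y pointing down), corners listed top-left, top-right, bottom-right, bottom-left give a positive signed area. It does not verify that the first point really is the top-left corner.

```python
import numpy as np


def parse_quad(values):
    """Turn (x1, y1, ..., x4, y4) into a (4, 2) array of corner points."""
    pts = np.asarray(values, dtype=np.float64)
    if pts.size != 8:
        raise ValueError(f"expected 8 coordinates, got {pts.size}")
    return pts.reshape(4, 2)


def signed_area(quad):
    """Shoelace formula; positive for clockwise order in image coordinates."""
    x, y = quad[:, 0], quad[:, 1]
    return 0.5 * float(np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y))


def check_quad(values, min_area=1.0):
    """Classify a box as 'ok', 'degenerate' or 'wrong order'.

    min_area is an arbitrary illustrative threshold in square pixels.
    """
    area = signed_area(parse_quad(values))
    if abs(area) < min_area:
        return "degenerate"  # e.g. all four corners on the same point
    if area < 0:
        return "wrong order"  # corners listed counter-clockwise
    return "ok"


print(check_quad([322, 506, 322, 506, 322, 506, 322, 506]))  # box from the report
print(check_quad([0, 0, 40, 0, 40, 10, 0, 10]))
```

Running a check like this over the whole label set makes it easier to find the offending image than waiting for the data loader to fail mid-training.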