JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Apache License 2.0
23.76k stars 3.11k forks

Not been able to run easyocr in spark udf #764

Open patilauminfi opened 2 years ago

patilauminfi commented 2 years ago

Hi, I am trying to run easyocr inside a Spark UDF. (The UDF also calls a pandas apply function internally.) When I run the script, I get the following error.

Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/tmp/ipykernel_4537/530021952.py", line 107, in get_parsed_output
  File "/home/centos/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 8839, in apply
    return op.apply().__finalize__(self, method="apply")
  File "/home/centos/.local/lib/python3.8/site-packages/pandas/core/apply.py", line 727, in apply
    return self.apply_standard()
  File "/home/centos/.local/lib/python3.8/site-packages/pandas/core/apply.py", line 851, in apply_standard
    results, res_index = self.apply_series_generator()
  File "/home/centos/.local/lib/python3.8/site-packages/pandas/core/apply.py", line 871, in apply_series_generator
    results[i] = results[i].copy(deep=False)
  File "/home/centos/.local/lib/python3.8/site-packages/pandas/core/apply.py", line 138, in f
    return func(x, *args, **kwargs)
  File "/tmp/ipykernel_4537/1144331222.py", line 32, in crop_save
  File "/home/centos/.local/lib/python3.8/site-packages/easyocr/easyocr.py", line 400, in readtext
    result = self.recognize(img_cv_grey, horizontal_list, free_list,\
  File "/home/centos/.local/lib/python3.8/site-packages/easyocr/easyocr.py", line 330, in recognize
    result0 = get_text(self.character, imgH, int(max_width), self.recognizer, self.converter, image_list,\
  File "/home/centos/.local/lib/python3.8/site-packages/easyocr/recognition.py", line 206, in get_text
    result1 = recognizer_predict(recognizer, converter, test_loader,batch_max_length,\
  File "/home/centos/.local/lib/python3.8/site-packages/easyocr/recognition.py", line 101, in recognizer_predict
  File "/home/centos/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1751, in eval
    return self.train(False)
  File "/home/centos/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1732, in train
  File "/home/centos/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1732, in train
  File "/home/centos/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1732, in train
  [Previous line repeated 1 more time]
  File "/home/centos/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1731, in train
    for module in self.children():
  File "/home/centos/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1618, in children
    for name, module in self.named_children():
  File "/home/centos/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1636, in named_children
    for name, module in self._modules.items():
  File "/home/centos/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LinearPackedParams' object has no attribute '_modules'

I am using CentOS 7, Python 3.8, torch 1.11.0+cu102, and torchvision 0.12.0+cu102.

Can anyone please help?

s39674 commented 2 years ago

Hi @patilauminfi, do you have any code that I can run to reproduce this?

patilauminfi commented 2 years ago

Hi @s39674, when I run the following function inside a Spark UDF, it throws this error:

import easyocr
from io import BytesIO
reader = easyocr.Reader(['en'])

def crop_save(img):
    # Body omitted in the post; per the traceback, it calls reader.readtext(...).
    return txt

You can take the following files for reference.

U05A11P10 pdf_12_13

zba18 commented 2 years ago

I get the same error.

kristenfed commented 1 year ago

Hi @patilauminfi, I recently faced the same problem. It only worked for me once I set the parameter quantize=False for Reader.
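
For anyone landing here later, the quantize=False workaround could look roughly like the sketch below. It is not a confirmed fix. `get_reader` and `ocr_image_bytes` are hypothetical helper names, `gpu=False` is an assumption about the workers, and building the Reader lazily on the worker instead of at module level is an extra precaution that nobody in this thread has tested.

```python
_READER = None  # cached per Python worker process


def get_reader():
    """Create the easyocr.Reader on first use and reuse it afterwards."""
    global _READER
    if _READER is None:
        import easyocr  # lazy import: only needed where OCR actually runs

        # quantize=False is the setting reported to work in this thread;
        # gpu=False is an assumption, change it if the workers have GPUs.
        _READER = easyocr.Reader(["en"], gpu=False, quantize=False)
    return _READER


def ocr_image_bytes(image_bytes, reader_factory=get_reader):
    """Return the text recognized in an encoded image (e.g. PNG bytes)."""
    reader = reader_factory()
    # Spark hands binary columns to Python as bytearray; convert to bytes
    # before passing the image to readtext.
    # detail=0 makes readtext return a plain list of strings.
    return " ".join(reader.readtext(bytes(image_bytes), detail=0))
```

Wrapping `ocr_image_bytes` with `pyspark.sql.functions.udf` and a `StringType()` return type should then let it run on a binary image column. The `reader_factory` argument is only there so the function can be exercised with a stub reader without loading the models.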