hrishikeshrt / google_drive_ocr

Perform OCR using Google's Drive API v3

Runtime error #2

Closed defencedog closed 2 years ago

defencedog commented 2 years ago

Description

Installed from source. I get a runtime error, although three separate pages were still written to my directory.

What I Did

~ $ google-ocr --config /storage/28E1-1BF4/GoogleOCR/gdo.cfg --pdf /storage/emulated/0/Download/In.pdf
Traceback (most recent call last):
  File "/data/data/com.termux/files/usr/bin/google-ocr", line 33, in <module>
    sys.exit(load_entry_point('google-drive-ocr==0.2.5', 'console_scripts', 'google-ocr')())
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/google_drive_ocr/cli.py", line 153, in main
    app.perform_ocr_batch(
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/google_drive_ocr/application.py", line 266, in perform_ocr_batch
    _workload = workload + (idx < extra)
UnboundLocalError: local variable 'extra' referenced before assignment

hrishikeshrt commented 2 years ago

This is the same error as mentioned in #1. Please make sure you have installed the latest version. You may have to run pip install --upgrade google-drive-ocr to get the latest version (0.2.5+).
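As a quick way to confirm which version is recorded for the installed package, here is a minimal sketch using the standard-library importlib.metadata module. The helper name installed_version is hypothetical and not part of google-drive-ocr. Note that this reports the installed package metadata only; it does not prove that the files on disk match that version.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the recorded version of a distribution, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "google-drive-ocr" is the distribution name used with pip in this thread.
    print(installed_version("google-drive-ocr"))
```

If this prints None, the package is not installed in the interpreter that ran the script, which may differ from the one behind the google-ocr command.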

defencedog commented 2 years ago

I built it via git clone. The version (0.2.5) is shown above in the runtime error traceback.

hrishikeshrt commented 2 years ago

Ah, my bad. I see that you have 0.2.5. In that case, though, I am unable to understand or reproduce the issue. The variable extra does get set before it's accessed. Could you paste the output of the following?

grep extra /data/data/com.termux/files/usr/lib/python3.10/site-packages/google_drive_ocr/application.py -C 5

defencedog commented 2 years ago

Here

.../site-packages/google_drive_ocr $ grep extra application.py -C 5
        workload = file_count // workers
        if workers > 1:
            print(f"Total {file_count} files "
                  f"distributed among {workers} workers.")
            print(f"Workload: {workload}-{workload + 1} per worker")
            extra = file_count % workers

        worker_arguments = []
        _start = 0
        for idx in range(workers):
            _workload = workload + (idx < extra)
            worker_arguments.append({
                'worker_id': idx,
                'image_files': image_files[_start:_start+_workload],
                'disable_tqdm': disable_tqdm
            })

hrishikeshrt commented 2 years ago

This still seems to be the older version of the file. Check out the current version of the file in the repository:

https://github.com/hrishikeshrt/google_drive_ocr/blob/32dc57cf2ff38f7b7b5bc6884074c91a167ba3af/google_drive_ocr/application.py#L256

Perhaps an uninstall and re-install of the package would fix it.
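For readers hitting the same traceback, here is a minimal sketch of why the older snippet pasted above fails. In that snippet, extra is assigned only inside the if workers > 1 branch, so a run with a single worker (presumably the case here) reaches idx < extra with extra never assigned. The function names split_workload_old and split_workload_fixed are hypothetical, and the divmod-based version is just one possible fix, not necessarily the one used in the repository.

```python
def split_workload_old(file_count, workers):
    """Mimics the older logic: 'extra' is only assigned when workers > 1."""
    workload = file_count // workers
    if workers > 1:
        extra = file_count % workers
    sizes = []
    for idx in range(workers):
        # Raises UnboundLocalError when workers == 1, because 'extra' was never assigned.
        sizes.append(workload + (idx < extra))
    return sizes


def split_workload_fixed(file_count, workers):
    """One possible fix: always compute 'extra', regardless of the worker count."""
    workload, extra = divmod(file_count, workers)
    # The first 'extra' workers each take one additional file.
    return [workload + (idx < extra) for idx in range(workers)]


if __name__ == "__main__":
    print(split_workload_fixed(3, 1))
    print(split_workload_fixed(7, 3))
    try:
        split_workload_old(3, 1)
    except UnboundLocalError as error:
        print(f"old logic failed: {error}")
```

With more than one worker both versions behave the same, which would explain why the error only appears in single-worker runs.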

defencedog commented 2 years ago

Dear author, I am truly sorry. Strangely, I upgraded pip today and the problem is solved.

google-ocr --config /storage/28E1-1BF4/GoogleOCR/gdo.cfg --pdf in.pdf
(PID:26799): 100%|███████████████████████████████████████████████████| 3/3 [00:20<00:00,  6.91s/it]
Total Time Taken: 20.81 seconds

hrishikeshrt commented 2 years ago

No need to be sorry :-) I am glad the issue is resolved.