DS4SD / docling

Get your documents ready for gen AI
https://ds4sd.github.io/docling
MIT License
10.48k stars, 507 forks

Bug #370

Closed by patle22cute 3 days ago

patle22cute commented 3 days ago

Bug

PermissionError: [Errno 13] Permission denied: 'C:\Users\Dell\AppData\Local\Temp\tempxxxxx.png'

Steps to reproduce

Unable to run with "Force full page OCR" enabled. The error is raised at the line: fp = builtins.open(filename, "w+b")

Docling version

2.5.2

Python version

3.11

dolfim-ibm commented 3 days ago

Can you please provide more details on which OCR engine you are using?

patle22cute commented 3 days ago

I use TesseractCliOcrOptions.

dolfim-ibm commented 3 days ago

This OCR engine requires writing temporary files to the disk. The files are saved in the temporary location provided by the Python tempfile module.

If your user doesn't have write permissions in that folder, you might be able to provide another location via the TMPDIR environment variable. See the documentation of the tempfile.gettempdir() function.
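As a rough illustration of the TMPDIR suggestion, the sketch below redirects Python's tempfile module to a different folder from inside a script. The helper name `use_temp_dir` is hypothetical and not part of docling. It assumes the variable is set before any temporary files are created; because `tempfile.gettempdir()` caches its result, the cache is reset explicitly.

```python
import os
import tempfile
from pathlib import Path


def use_temp_dir(path):
    """Point Python's tempfile module at `path` (hypothetical helper, not a docling API)."""
    target = Path(path)
    # Make sure the folder exists and is one the current user can write to.
    target.mkdir(parents=True, exist_ok=True)
    # TMPDIR is the first environment variable tempfile.gettempdir() consults.
    os.environ["TMPDIR"] = str(target)
    # gettempdir() caches its answer in tempfile.tempdir; reset it so the
    # new TMPDIR value is picked up on the next call.
    tempfile.tempdir = None
    return tempfile.gettempdir()
```

Calling `use_temp_dir(...)` with a writable folder at the top of the script, before the conversion runs, should make later `tempfile` calls in the same process use that folder. Setting TMPDIR in the shell before starting Python has the same effect without any code change.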

patle22cute commented 3 days ago

Thanks for your response. I've checked the permissions carefully, and the error seems to occur because this .png file is still held open by another function. After adding: if os.path.exists(fname): os.remove(fname) before the save step, the program runs normally. One more question: does version 2.5.2 support detecting horizontal rotation, and how can I add a trained language file? Thanks, @dolfim-ibm.
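The workaround described above can be sketched in isolation as follows. This is not docling's actual code; the helper names `write_temp_png` and `remove_if_exists` are hypothetical. It rests on one assumption about the cause: on Windows, a file created with `tempfile.NamedTemporaryFile` generally cannot be opened a second time by name while the first handle is still open, which can surface as a PermissionError. Closing the handle before reusing the path avoids that.

```python
import os
import tempfile


def write_temp_png(data: bytes) -> str:
    """Write `data` to a fresh temporary .png and return its path (hypothetical helper)."""
    # delete=False keeps the file on disk after the handle is closed, so the
    # path can be reopened by other code (for example an external OCR CLI).
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
        fname = tmp.name
    # The handle is closed here; reopening by name is now safe on Windows too.
    with open(fname, "w+b") as fp:
        fp.write(data)
    return fname


def remove_if_exists(fname: str) -> None:
    """Mirror of the check from the comment above: delete the file if it is present."""
    if os.path.exists(fname):
        os.remove(fname)
```

With `delete=False` the caller is responsible for cleanup, so `remove_if_exists(path)` would be called once the OCR step has finished with the file.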

dolfim-ibm commented 3 days ago

Good, happy you could solve your issue.

Regarding your questions:

  1. Training or adding OCR languages should be handled in the OCR engine itself.
  2. Rotation is captured for programmatic documents, but for scanned ones we rely on the OCR engine being able to detect it.