janis91 / ocr

Nextcloud OCR (optical character recognition) processing for images with tesseract-js
GNU Affero General Public License v3.0

Preserve metadata of original image in created pdf document #287

Open stuckinger opened 2 years ago

stuckinger commented 2 years ago

Feature request

Expected Behavior

After successful OCR and creation of the PDF document, the metadata from the original image, such as tags and comments, should appear as metadata of the new PDF.

Current Behavior

Metadata such as tags and comments is not carried over to the new PDF and is lost.

Possible Solution

Read the metadata stored for the old file ID in the tagging and comments tables and add it to the new file ID.
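To make the idea concrete, here is a minimal sketch of the copy step. It does not use any Nextcloud API: `MetadataStore` is a hypothetical in-memory stand-in for the tagging and comments tables, and `copy_metadata` and the file IDs are made up for illustration. How the app would actually read and write this data is left open.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Hypothetical in-memory stand-in for the tagging and comments tables."""

    tags: dict[int, set[str]] = field(default_factory=dict)       # file id -> tag names
    comments: dict[int, list[str]] = field(default_factory=dict)  # file id -> comment texts


def copy_metadata(store: MetadataStore, old_file_id: int, new_file_id: int) -> None:
    """Attach the tags and comments of old_file_id to new_file_id.

    The original image keeps its metadata; the new file receives a copy.
    """
    # Merge tags so that any tags already set on the new file are kept.
    old_tags = store.tags.get(old_file_id, set())
    if old_tags:
        store.tags.setdefault(new_file_id, set()).update(old_tags)

    # Append comments in their original order.
    old_comments = store.comments.get(old_file_id, [])
    if old_comments:
        store.comments.setdefault(new_file_id, []).extend(old_comments)


store = MetadataStore()
store.tags[101] = {"invoice", "2021"}
store.comments[101] = ["Scanned at the office"]

# 101 stands for the scanned image, 202 for the PDF created by OCR (made-up ids).
copy_metadata(store, old_file_id=101, new_file_id=202)
```

The sketch copies rather than moves the metadata, so it still works if the original image is kept next to the PDF. Whether the source image should keep its tags and comments is a design decision for the app.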

Context

I tag or comment on scanned images up front, using the easy image browsing in the Nextcloud web app, before running OCR on multiple files. When trying to comment on and/or tag a PDF file instead, easy arrow-key browsing is unavailable and the preview might not be as useful as it is with images.