Closed sagar-kalburgi-ripcord closed 2 months ago
Also, if it helps: the two documents attached to this ticket claim to be PDF/A-1a compliant, but validating either of them with the unipdf PDF/A-1a validation function fails with errors. And when I enforce the PDF/A-1a standard on the merged document, the Unicode text gets corrupted and I am unable to search for any text.
Please find attached the two PDF files I used for the sample program: Page 1 of OCRed image0069.pdf and Delta.pdf
Hello @sagar-kalburgi-ripcord, we have fixed your issue. The fix has been merged into the development branch of the unipdf source repository (I believe you have access to it and can test it).
It will also be included in the next release of UniPDF; we will let you know when it is out.
Hi @anovik,
Thanks! Sure, I will test it. Any idea when the next release of UniPDF will be?
@sagar-kalburgi-ripcord It is planned for the end of April.
@anovik I tested the fix from your development branch and it looks good! I'm afraid the end of April is too late for us. This issue is critically impacting Production, and we are losing revenue every day it remains open. Would it be possible for you to provide a hotfix release as soon as possible?
@sagar-kalburgi-ripcord We completely understand the urgency of your situation and are prioritizing the release of a hotfix to address this issue as quickly as possible.
We'll keep you updated on the progress and notify you as soon as the release is completed.
ok thanks!
@sagar-kalburgi-ripcord The new release of UniPDF is available at https://github.com/unidoc/unipdf/releases/tag/v3.57.0 and it includes the fix for this issue.
Closing the current ticket, feel free to re-open it in case of any problems.
Description
I am merging two searchable PDF documents (produced by an OCR engine) using unipdf, and I specify that the PDF/A-1a standard should be applied before the merged result is written to an output PDF file. When I open the output PDF file, the Unicode text is corrupted: I am unable to search for any existing word in it, and when I copy some of the text from the output PDF and paste it into a text editor, I see this unrecognizable text:
ıˇ ˇ ı ˇı
ı
Expected Behavior
When I open the output PDF file and search for an existing word, it should show up. And when I copy the textual contents of the output PDF file and paste it on a notepad, it should paste the exact text that's present.
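The expectation above can be turned into a rough automated check. The sketch below is only an illustration and not part of unipdf: `missingWords` and `readableRatio` are hypothetical helper names, and it assumes the page text has already been extracted into a string by some other means and that the OCR output is mostly ASCII (English) text.

```go
package main

import (
	"strings"
	"unicode"
)

// missingWords (hypothetical helper) returns the expected words that do not
// occur in the extracted text, compared case-insensitively.
func missingWords(extracted string, expected []string) []string {
	lower := strings.ToLower(extracted)
	var missing []string
	for _, w := range expected {
		if !strings.Contains(lower, strings.ToLower(w)) {
			missing = append(missing, w)
		}
	}
	return missing
}

// readableRatio (hypothetical helper) reports the fraction of non-space runes
// that are ASCII letters or digits. Assumption: the source documents contain
// mostly English text, so a ratio near zero suggests a broken text mapping.
func readableRatio(s string) float64 {
	total, readable := 0, 0
	for _, r := range s {
		if unicode.IsSpace(r) {
			continue
		}
		total++
		if r <= unicode.MaxASCII && (unicode.IsLetter(r) || unicode.IsDigit(r)) {
			readable++
		}
	}
	if total == 0 {
		return 0
	}
	return float64(readable) / float64(total)
}
```

Running the same check on the text of the input documents and of the merged output would show whether the merge step preserved the searchable text; any threshold on the ratio is a judgment call for the documents at hand.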
Actual Behavior
Steps to reproduce the behavior: Please run the sample program below with a valid UniDoc license, using the attached PDF files.
Attachments
2 PDF files have been attached