joesmith0 closed this issue 2 years ago
Hi @joesmith0, thanks for using the library and raising a bug report. To investigate this further, could you share a PDF that demonstrates the issue? Please remove any sensitive information from the PDF before sharing it.
Hi @samkit-jain @jsvine, is there any way to search for keywords that contain a space, e.g. ['Team size', 'Company Name'], while using "page.extract_words()"?
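One possible approach to the question above, as a minimal sketch rather than a built-in feature: since "page.extract_words()" returns one dict per word, a multi-word keyword can be matched against runs of consecutive words. The helper name `find_phrases` is hypothetical (not part of pdfplumber), and the sketch assumes the words arrive in reading order and that each dict carries "text", "x0", "x1", "top" and "bottom" keys.

```python
def find_phrases(words, phrases):
    """Find multi-word phrases in a list of word dicts.

    `words` is assumed to look like the output of page.extract_words():
    dicts with "text", "x0", "x1", "top" and "bottom" keys, in reading order.
    Returns one dict per occurrence, with a bounding box covering all of
    the matched words. This helper is hypothetical, not a pdfplumber API.
    """
    matches = []
    for phrase in phrases:
        tokens = phrase.split()
        if not tokens:
            continue  # skip empty or whitespace-only phrases
        n = len(tokens)
        # Slide a window of n consecutive words over the word list.
        for i in range(len(words) - n + 1):
            window = words[i:i + n]
            if [w["text"] for w in window] == tokens:
                matches.append({
                    "text": phrase,
                    "x0": min(w["x0"] for w in window),
                    "x1": max(w["x1"] for w in window),
                    "top": min(w["top"] for w in window),
                    "bottom": max(w["bottom"] for w in window),
                })
    return matches
```

With a real page this would be called as `find_phrases(page.extract_words(), ['Team size', 'Company Name'])`. Matching is exact and case-sensitive here; a phrase split across two lines is only found if the two words are adjacent in the extracted order.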
Hey @samkit-jain, unfortunately I wasn't able to get the redaction tool to work on my PDFs... Do you have any PDFs at hand with a mix of portrait and landscape pages? The characters should all be upright (same direction) despite the orientation.
Closing this issue due to inactivity, lack of issue-reproducing PDF, and lack of other users expressing similar issues. Feel free to continue the discussion, however, especially if someone comes across a PDF that allows us to reproduce.
Describe the bug
In PDF documents with a mix of landscape- and portrait-oriented pages, the bounding boxes of highlight annotations (x0, x1, top, bottom) on the landscape-oriented pages do not match the expected values for the words they should encapsulate. x0 can be negative in these cases, which suggests the annotation thinks the page is in a portrait orientation.
Code to reproduce the problem
Expected behavior
The bounding boxes of the words outputted above should be contained within the bounding box of the annotation.
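The containment expectation can be written as a small check. This is only an illustrative sketch, not the reporter's reproduction code: the helper name `bbox_contains` and the `tolerance` parameter are assumptions, and both arguments are assumed to be dicts with "x0", "x1", "top" and "bottom" keys, as word and annotation objects have in pdfplumber.

```python
def bbox_contains(outer, inner, tolerance=0.0):
    """Return True if `inner` lies within `outer`, allowing some slack.

    Both arguments are dicts with "x0", "x1", "top" and "bottom" keys.
    `tolerance` (an assumed parameter) widens `outer` on every side to
    absorb small rounding differences between words and annotations.
    """
    return (
        inner["x0"] >= outer["x0"] - tolerance
        and inner["x1"] <= outer["x1"] + tolerance
        and inner["top"] >= outer["top"] - tolerance
        and inner["bottom"] <= outer["bottom"] + tolerance
    )
```

On a portrait page the check would be expected to pass for each highlighted word. On the affected landscape pages it would fail, for example when the annotation's x0 is negative while the word's x0 is not.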
Actual behavior
The bounding box of the annotation is far from the expected values, and x0 is sometimes negative.
Environment