Closed mugiwara85 closed 3 years ago
Hi @mugiwara85, I appreciate your interest in the library. If you have the coordinates of the rectangles, you can use the page.crop(...) method to crop the page and then run .extract_text(...) on the cropped page, which will give you the text inside the rectangle.
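For anyone landing here later, a minimal sketch of that suggestion, assuming each entry in page.rects is a dict with x0, top, x1 and bottom keys and that page.crop(...) accepts an (x0, top, x1, bottom) tuple. The helper names rect_bbox, extract_rect_texts and print_rect_texts are made up for illustration and are not part of the library.

```python
def rect_bbox(rect):
    """Build an (x0, top, x1, bottom) bounding box from a rect dict.

    Assumes the rect carries the keys x0, top, x1 and bottom.
    """
    return (rect["x0"], rect["top"], rect["x1"], rect["bottom"])


def extract_rect_texts(page):
    """Return one string per rectangle found on the given page."""
    texts = []
    for rect in page.rects:
        # Crop the page down to the rectangle, then extract only that region.
        cropped = page.crop(rect_bbox(rect))
        # Fall back to "" in case extract_text() yields None for an empty region.
        texts.append(cropped.extract_text() or "")
    return texts


def print_rect_texts(path):
    """Hypothetical driver: print the text of every rectangle in a PDF."""
    import pdfplumber  # imported lazily so the helpers above stay dependency-free

    with pdfplumber.open(path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            for text in extract_rect_texts(page):
                print(f"page {number}: {text!r}")
```

Calling print_rect_texts("your_file.pdf") would walk every page. Two caveats: if the rectangles overlap (as in a cascaded layout), a crop of one rectangle may also pick up text that sits inside a neighbouring one, and a rectangle that extends past the page edge may make the crop fail, so the boxes might need clamping to the page bounds first.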
Hi!
I have a PDF (unfortunately I can't share it). It contains multiple rectangles in a cascaded style. Something like this:
In each rectangle is the text I need. I can find all rectangles on a given page like this:
    for page in range(pagecount):
        current_page = pdf.pages[page]
        print("rectangles=", current_page.rects)
But how can I extract the text from them? extract_text() extracts text from the whole page, but I only need the text inside the rectangles.
Thanks in advance!