jsvine / pdfplumber

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
MIT License
6.57k stars 659 forks

Identify contiguous rectangles with different fill colors (e.g. formatted table) as one Rectangle object #668

Closed: moreproblems closed this issue 2 years ago

moreproblems commented 2 years ago

When finding rectangles on a page, the package seems to compute the wrong bounding box when contiguous rectangles have more than one fill color. In the attached image, for example, the bounding box returned is that of the first row, filled in grey, and the rest of the table is ignored.

image
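
To make the request concrete, here is a rough sketch of the grouping described above: rectangles that touch or overlap are merged into one bounding box, regardless of fill color. It is not part of pdfplumber. The helper name `merge_touching_rects` and the `tolerance` parameter are hypothetical, and the input is assumed to be a list of dicts with `x0`, `top`, `x1` and `bottom` keys, which is the shape of the entries in `page.rects`.

```python
def merge_touching_rects(rects, tolerance=1.0):
    """Group rectangles that touch or overlap (within `tolerance`) and
    return one (x0, top, x1, bottom) bounding box per group.

    Hypothetical helper, not a pdfplumber API. Fill color is ignored on purpose.
    """
    n = len(rects)
    parent = list(range(n))  # union-find forest over rectangle indices

    def find(i):
        # Follow parent links to the group root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def touches(a, b):
        # True when the two boxes overlap or lie within `tolerance` on both axes.
        return (
            a["x0"] <= b["x1"] + tolerance
            and b["x0"] <= a["x1"] + tolerance
            and a["top"] <= b["bottom"] + tolerance
            and b["top"] <= a["bottom"] + tolerance
        )

    # Join every pair of touching rectangles into the same group.
    for i in range(n):
        for j in range(i + 1, n):
            if touches(rects[i], rects[j]):
                parent[find(i)] = find(j)

    # Collect the members of each group.
    groups = {}
    for i, rect in enumerate(rects):
        groups.setdefault(find(i), []).append(rect)

    # One union bounding box per group.
    return [
        (
            min(r["x0"] for r in g),
            min(r["top"] for r in g),
            max(r["x1"] for r in g),
            max(r["bottom"] for r in g),
        )
        for g in groups.values()
    ]
```

With a real page, something like `merge_touching_rects(page.rects)` should give one box for the whole table, which could then be passed to `page.crop(...)`. The pairwise comparison is quadratic in the number of rectangles, which is probably fine for a single page but is a simplification.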

jsvine commented 2 years ago

Hi @moreproblems, and thanks for your interest in this library. Unfortunately, I'm having some trouble understanding the query here. Could you share the PDF itself and code that demonstrates your situation?

Closing this for now, as it's not clear there's a bug or practical feature request here, but we can continue the conversation.