lilygrier closed 4 years ago
Hi @lilygrier, and thanks for your interest in this library. Cropping the page in half sounds like a reasonable approach. If the page is truly split down the exact middle, you should be able to determine the location via `page.width // 2`. If it's not exactly in the middle, hopefully it's in a fairly consistent place, in which case you could hardcode that x-value.
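A minimal sketch of that approach, in case it helps. The helper names `column_bboxes` and `extract_two_columns` and the `split_x` parameter are made up for illustration, not part of the library. It also assumes the page's coordinate origin is at (0, 0); if a PDF's page box starts elsewhere, the crop boxes would need to be offset accordingly.

```python
def column_bboxes(width, height, split_x=None):
    """Return (left, right) crop boxes as (x0, top, x1, bottom) tuples.

    split_x is a hypothetical parameter: the x-coordinate of the gutter
    between the columns. It defaults to the exact middle of the page.
    """
    if split_x is None:
        split_x = width / 2
    left = (0, 0, split_x, height)
    right = (split_x, 0, width, height)
    return left, right


def extract_two_columns(pdf_path, split_x=None):
    """Extract text page by page, left column first, then right column."""
    # Imported here so column_bboxes can be used without pdfplumber installed.
    import pdfplumber

    texts = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            left_box, right_box = column_bboxes(page.width, page.height, split_x)
            # extract_text() can return None for an empty region.
            left_text = page.crop(left_box).extract_text() or ""
            right_text = page.crop(right_box).extract_text() or ""
            texts.append(left_text + "\n" + right_text)
    return texts
```

For example, `extract_two_columns("DL498e.pdf")` would split each page at its midpoint, while passing `split_x=...` lets you hardcode the gutter position if the columns are not centered.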
My apologies if this is addressed elsewhere (somewhat new to deciphering documentation). I'm working with PDFs like these (http://www.fao.org/ag/locusts/common/ecg/2536/en/DL498e.pdf) that have text across two columns. When I try to extract text, it's blurring the columns into one. Would the solution be to crop the page down the middle and read in each side separately? If so, how would I determine the location of the middle of the page? Thanks so much!