Several forms (e.g. Affidavit of Indigency) have the entire contents of the PDF inside a PDF Figure for some reason, so pdfminer doesn't extract any text from them.
This fixes that by setting the `all_texts` layout parameter so pdfminer looks inside Figures, and by recursively unnesting the boxes in those figures to get just the horizontal text lines.
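
For reviewers unfamiliar with pdfminer's layout tree, here is a rough sketch of the idea, not the exact code in this PR. It assumes pdfminer.six, where the layout option is spelled `LAParams(all_texts=True)`. The helper names `iter_horizontal_lines` and `extract_lines` are made up for illustration.

```python
from typing import Any, Iterator, List


def iter_horizontal_lines(obj: Any, line_type: type) -> Iterator[Any]:
    """Yield every `line_type` instance nested anywhere under `obj`."""
    if isinstance(obj, line_type):
        yield obj
        return  # a text line's children are characters; do not descend
    if isinstance(obj, (str, bytes)):
        return  # guard: strings are iterable but are never layout containers
    try:
        children = iter(obj)
    except TypeError:
        return  # leaf object (e.g. a character, rectangle, or image)
    for child in children:
        yield from iter_horizontal_lines(child, line_type)


def extract_lines(pdf_path: str) -> List[str]:
    """Hypothetical wrapper: return the text of every horizontal line in a PDF."""
    # Imported here so the helper above works without pdfminer.six installed.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LAParams, LTTextLineHorizontal

    # all_texts=True asks layout analysis to also process text inside figures.
    laparams = LAParams(all_texts=True)
    lines = []
    for page in extract_pages(pdf_path, laparams=laparams):
        for line in iter_horizontal_lines(page, LTTextLineHorizontal):
            lines.append(line.get_text())
    return lines
```

The recursion does not care how deeply the text boxes are nested inside figures, which is the point: pages, figures, and text boxes are all iterable containers, so the same walk handles forms with and without the wrapping Figure.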

Will merge after #91 is merged.