kento1109 closed this 5 years ago
@kento1109 As mentioned in the README: "Camelot only works with text-based PDFs and not scanned documents. If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based."
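The "click and drag" check from the README can also be approximated in code. Below is a minimal sketch, assuming pdfminer.six is installed; the helper names `has_text_layer` and `is_text_based_pdf` and the `min_chars` threshold are made up for illustration and are not part of Camelot or pdfminer.

```python
def has_text_layer(extracted_text: str, min_chars: int = 20) -> bool:
    """Return True if the extracted text holds at least min_chars visible characters.

    min_chars is an arbitrary illustrative threshold, not a Camelot setting.
    """
    # Drop all whitespace (spaces, newlines, form feeds) before counting.
    visible = "".join(extracted_text.split())
    return len(visible) >= min_chars


def is_text_based_pdf(pdf_path: str, min_chars: int = 20) -> bool:
    """Hypothetical helper: guess whether a PDF has a usable text layer."""
    # Imported lazily so has_text_layer() works without pdfminer.six installed.
    from pdfminer.high_level import extract_text

    return has_text_layer(extract_text(pdf_path), min_chars)
```

If `is_text_based_pdf("Sample_Dexa_Report.pdf")` returns `False`, the file is probably a scanned image, and Camelot's lattice and stream modes are unlikely to find any tables in it.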
Aside from being an image, the document you've attached is rotated. You can fix the rotation and try using OCR to extract data from this document.
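One possible way to do this, sketched under several assumptions: pdf2image (with poppler) and pytesseract (with the Tesseract binary) are installed, and the page needs a 90 degree counter-clockwise turn to become upright. The function names, the rotation angle, and the "two or more spaces" column heuristic are illustrative guesses, not something Camelot provides, and OCR output usually needs manual cleanup.

```python
import re
from typing import List


def split_columns(ocr_text: str) -> List[List[str]]:
    """Split OCR text into rows of cells, treating 2+ spaces as a column gap.

    This is a rough heuristic; real OCR output may not space columns cleanly.
    """
    rows = []
    for line in ocr_text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            rows.append(re.split(r"\s{2,}", line))
    return rows


def ocr_rotated_pdf(pdf_path: str, rotation_degrees: int = 90, dpi: int = 300) -> List[str]:
    """Hypothetical helper: rasterize each page, rotate it, and run OCR on it."""
    # Imported lazily so split_columns() works without the OCR stack installed.
    from pdf2image import convert_from_path
    import pytesseract

    texts = []
    for page in convert_from_path(pdf_path, dpi=dpi):
        # PIL rotates counter-clockwise; expand=True keeps the full page visible.
        upright = page.rotate(rotation_degrees, expand=True)
        # Ask Tesseract to keep inter-word spacing so column gaps survive.
        config = "--psm 6 -c preserve_interword_spaces=1"
        texts.append(pytesseract.image_to_string(upright, config=config))
    return texts
```

A typical use would be `rows = split_columns(ocr_rotated_pdf("Sample_Dexa_Report.pdf")[0])`, after which the rows can be checked by hand or loaded into a DataFrame.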
Thank you for the quick reply! When I parsed this PDF with pdfminer, I noticed that it is image-based.
First of all, I tried using OCR to convert the image to text data.
I want to extract the DXA Results Summary table from a PDF like this one:
Sample_Dexa_Report.pdf
But I cannot get it to work. (Camelot warns that no tables were found on the page.)
I tried both lattice and stream modes, but neither worked. How can I extract the table from this PDF?