aws-samples / amazon-textract-textractor

Analyze documents with Amazon Textract and generate output in multiple formats.
Apache License 2.0

cell content extraction error #355

Open Larbo53 opened 2 months ago

Larbo53 commented 2 months ago

Good morning,

Which approach should I use with Textractor to extract the cell data from the attached image and render the rows inside each cell correctly in Excel? Is there a row-level component within a cell?

Thank you for your feedback.

[Attached screenshots: Capture d’écran 2024-04-14 à 18 21 29, Capture d’écran 2024-04-14 à 18 20 01]

Belval commented 1 month ago

Sorry for the late reply on this. Could you provide the document.visualize() output for the above? I believe the 5 lines are being identified as a single row.
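
If the five lines are indeed being merged into a single cell, one possible workaround is to split that cell's words back into visual lines using their vertical positions. The following is only a minimal sketch under stated assumptions: the `Word` tuple, the `split_cell_into_lines` helper, and the `y_tolerance` value are hypothetical and not part of Textractor. It assumes you have already collected each word's text and top-left coordinates (for example from the words' bounding boxes) in normalized page units.

```python
from typing import List, Tuple

# Hypothetical word record: (text, x, y), where x and y are the top-left
# corner of the word in normalized page coordinates (0.0 to 1.0).
Word = Tuple[str, float, float]


def split_cell_into_lines(words: List[Word], y_tolerance: float = 0.005) -> List[str]:
    """Group a cell's words into visual lines by vertical position."""
    lines: List[List[Word]] = []
    # Walk the words top to bottom; a word joins the current line when its
    # y is within y_tolerance of that line's first word, else it starts a new line.
    for word in sorted(words, key=lambda w: w[2]):
        if lines and abs(word[2] - lines[-1][0][2]) <= y_tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, order words left to right before joining them.
    return [" ".join(w[0] for w in sorted(line, key=lambda w: w[1])) for line in lines]


# Made-up sample data: two visual lines inside one detected cell.
cell_words = [
    ("Item", 0.10, 0.200), ("A", 0.15, 0.201),
    ("Item", 0.10, 0.220), ("B", 0.15, 0.219),
]
print(split_cell_into_lines(cell_words))
```

Each returned string could then be written to its own spreadsheet row, or the strings could be joined with a newline and written to one cell with text wrapping enabled. The tolerance would likely need tuning to the document's font size.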

Larbo53 commented 1 month ago

Good morning,

Here are the source file (image.png), the document.visualize() output, and the resulting Excel file.

Thank you very much.

Sincerely,

[Attachments: image, out(sheet1).xlsx, tmp9hpjdpdt]