Open raidken opened 2 months ago
59766-textract-table.json
In the Textract output file, cell id 3f98227c-2981-4cd5-b23c-bee82e96bb54 references three words, but the code below returns null words for that cell.
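# Assumes the amazon-textract-textractor package.
from textractor.entities.document import Document

# Raw string, so that "\t" and "\5" in the Windows path are not read as escape sequences.
document = Document.open(r"c:\temp\59766-textract-table.json")

line_list = list(filter(lambda line: line.id == "3f98227c-2981-4cd5-b23c-bee82e96bb54", document.pages[6].lines))
print(line_list[0].words)

table_n = document.pages[6].tables[1]
for cell in table_n.table_cells:
    if cell.id == "c23b7b9e-7b90-42d4-ad94-41caa8931417":
        print(cell.words)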
I am able to reproduce the issue. Could you provide the original document for that response? It would make it easier to troubleshoot.
# Query for the line id that references the same three words.
line_list = list(filter(lambda line: line.id == "3f98227c-2981-4cd5-b23c-bee82e96bb54", document.pages[6].lines))
print(line_list[0].words)
# Returns the three words [Operating, Segment, Information].

# The cell in the Textract output references the same three words,
# but its words (or text) incorrectly come back as null.
table_n = document.pages[6].tables[1]

# Find the cell and output its words.
for cell in table_n.table_cells:
    if cell.id == "c23b7b9e-7b90-42d4-ad94-41caa8931417":
        print(cell.words)
# Returns null.
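To check what the raw response itself says, independent of the parsing library, the block's CHILD relationships can be resolved by hand. The following is a minimal sketch that uses only the standard Textract response layout (`Blocks`, `Id`, `BlockType`, `Relationships`, `Text`). The helper name `words_for_block` and the `sample` data are made up for illustration and are not taken from the attached file.

```python
import json


def words_for_block(response, block_id):
    """Return the Text of each WORD block listed as a CHILD of block_id."""
    blocks_by_id = {block["Id"]: block for block in response.get("Blocks", [])}
    block = blocks_by_id.get(block_id)
    if block is None:
        return []
    words = []
    for relationship in block.get("Relationships") or []:
        if relationship.get("Type") != "CHILD":
            continue
        for child_id in relationship.get("Ids", []):
            child = blocks_by_id.get(child_id)
            # Skip ids that are missing or that point at non-WORD blocks.
            if child is not None and child.get("BlockType") == "WORD":
                words.append(child.get("Text", ""))
    return words


# Tiny synthetic response (hypothetical ids, not the real document).
sample = json.loads("""
{
  "Blocks": [
    {"Id": "cell-1", "BlockType": "CELL",
     "Relationships": [{"Type": "CHILD", "Ids": ["w1", "w2", "w3"]}]},
    {"Id": "cell-2", "BlockType": "CELL"},
    {"Id": "w1", "BlockType": "WORD", "Text": "Operating"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Segment"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Information"}
  ]
}
""")

print(words_for_block(sample, "cell-1"))
print(words_for_block(sample, "cell-2"))
```

Against the real file, the same helper could be called with the cell id and the line id from the report, after loading the JSON with `json.load`. This assumes the saved file is a single response with a top-level `Blocks` list. If the cell id resolves to the three words here but `cell.words` is still empty, that would suggest the problem is in how the cell is parsed rather than in the Textract output.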