aws-samples / amazon-textract-code-samples

Amazon Textract Code Samples
MIT No Attribution

Unable to parse Document result in Python #39

Open anking opened 1 year ago

anking commented 1 year ago

Using textract-trp 0.1.3.

When parsing a "get_document_analysis" response with Document(response), the following traceback is produced:

Traceback (most recent call last):
  File "G:\dev\OCR\main.py", line 17, in <module>
    result = (textract.receive_document_result('52c4a450c667a18d89f4e26a1cf4b56859ad239f1a63279bec8f60458ae2284e'))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\dev\OCR\textract.py", line 62, in receive_document_result
    return Document(response)
           ^^^^^^^^^^^^^^^^^^
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 633, in __init__
    self._parse()
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 667, in _parse
    page = Page(documentPage["Blocks"], self._blockMap)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 516, in __init__
    self._parse(blockMap)
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 530, in _parse
    l = Line(item, blockMap)
        ^^^^^^^^^^^^^^^^^^^^
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 142, in __init__
    if(blockMap[cid]["BlockType"] == "WORD"):
       ~~~~~~~~^^^^^
KeyError: '9e2f5e38-f865-4b79-a37b-ac8ed7a19f02'
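
One possible cause, offered as a guess rather than a confirmed diagnosis: the KeyError is raised at blockMap[cid], which suggests a LINE block lists a child ID whose WORD block is not in the response that was parsed. That can happen if the job result is paginated and only the first "get_document_analysis" response is passed on. The sketch below assumes a boto3-style Textract client; the helper names collect_analysis_pages and find_missing_child_ids are hypothetical and not part of textract-trp or boto3.

```python
def collect_analysis_pages(client, job_id):
    """Fetch every paginated response for one Textract analysis job.

    `client` is assumed to behave like boto3.client("textract"): each call to
    get_document_analysis returns a dict that may carry a "NextToken".
    """
    pages = []
    next_token = None
    while True:
        kwargs = {"JobId": job_id}
        if next_token:
            kwargs["NextToken"] = next_token
        response = client.get_document_analysis(**kwargs)
        pages.append(response)
        next_token = response.get("NextToken")
        if not next_token:
            return pages


def find_missing_child_ids(pages):
    """Return CHILD block IDs that are referenced but absent from `pages`."""
    known_ids = {
        block["Id"] for page in pages for block in page.get("Blocks", [])
    }
    missing = set()
    for page in pages:
        for block in page.get("Blocks", []):
            # "Relationships" is not present on every block.
            for rel in block.get("Relationships") or []:
                if rel.get("Type") == "CHILD":
                    missing.update(
                        cid for cid in rel.get("Ids", []) if cid not in known_ids
                    )
    return missing
```

If find_missing_child_ids reports IDs for a single response but an empty set for the full list, pagination is the likely culprit. As far as I can tell, Document also accepts a list of responses, so Document(pages) may be worth trying, but please verify that against the installed trp version.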