aws-samples / amazon-textract-response-parser

Parse JSON response of Amazon Textract
Apache License 2.0
212 stars 95 forks

Highlighted text gets appended with word SELECTED in output #167

Open sawasume opened 9 months ago

sawasume commented 9 months ago

The Textract response parser library appends the text SELECTED when something is highlighted in the text. An example is shown below. The original doc:

doc_og

This is what the output looks like:

selected_Capture

Code to generate the above output

code-print

schadem commented 9 months ago

Looks like checkbox/marked identification from the checkbox model, which is part of TABLES and FORMS. Those are printed out as part of the rendering when available. Unfortunately there is no parameter right now to turn them off. A workaround could be to filter the SELECTION_ELEMENT blocks out of the JSON before sending it to trp, until we make a param available.
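A minimal sketch of that workaround, not an official fix. It assumes the response is a single Textract result dict with a top-level `Blocks` list. The helper name `drop_selection_elements` is hypothetical and not part of trp. It removes every block whose `BlockType` is `SELECTION_ELEMENT` and also prunes their IDs from the `Relationships` of the remaining blocks, on the assumption that the parser looks child IDs up in its block map and would fail on dangling references.

```python
import copy


def drop_selection_elements(response):
    """Return a copy of a Textract response without SELECTION_ELEMENT blocks.

    Hypothetical helper (not part of trp). Assumes `response` is one
    Textract result dict containing a "Blocks" list.
    """
    cleaned = copy.deepcopy(response)  # leave the caller's dict untouched
    blocks = cleaned.get("Blocks", [])

    # IDs of the checkbox / selection-mark blocks to discard.
    removed_ids = {
        b.get("Id") for b in blocks if b.get("BlockType") == "SELECTION_ELEMENT"
    }

    kept = []
    for block in blocks:
        if block.get("Id") in removed_ids:
            continue
        # Remove references to discarded blocks so no child ID dangles.
        relationships = []
        for rel in block.get("Relationships") or []:
            ids = [i for i in rel.get("Ids", []) if i not in removed_ids]
            if ids:
                relationships.append({**rel, "Ids": ids})
        if relationships:
            block["Relationships"] = relationships
        else:
            block.pop("Relationships", None)
        kept.append(block)

    cleaned["Blocks"] = kept
    return cleaned


# Tiny hand-written stand-in for a real Textract response.
response = {
    "Blocks": [
        {
            "Id": "cell-1",
            "BlockType": "CELL",
            "Relationships": [{"Type": "CHILD", "Ids": ["word-1", "sel-1"]}],
        },
        {"Id": "word-1", "BlockType": "WORD", "Text": "Total"},
        {
            "Id": "sel-1",
            "BlockType": "SELECTION_ELEMENT",
            "SelectionStatus": "SELECTED",
        },
    ]
}

cleaned = drop_selection_elements(response)
print([b["BlockType"] for b in cleaned["Blocks"]])  # ['CELL', 'WORD']

# With trp installed, the cleaned dict would then be parsed as usual:
#   from trp import Document
#   doc = Document(cleaned)
```

If the response is a list of page results (for example from a paginated async job), the same function could be applied to each element of the list before handing it to trp. Note that this also drops genuine checkbox results, so it only fits documents where selection marks are not needed.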