aws-samples/amazon-textract-textractor
Analyze documents with Amazon Textract and generate output in multiple formats.
Apache License 2.0
359 stars, 134 forks
issues
Save image doesn't work with S3 path - TypeError: Invalid input type 'bytearray'
#382
steffeng
closed
5 days ago
3
Fix .to_markdown() raising an exception on missing local config
#381
Belval
closed
1 week ago
0
issue regarding .to_markdown() method
#380
red-sky17
opened
1 week ago
2
fix(expense): Expenses with no summary fields
#379
athewsey
closed
1 week ago
0
Replace region mismatch with invalid S3 object
#378
Belval
closed
1 week ago
0
Improve error message that identified InvalidS3ObjectException as RegionMismatch
#377
Belval
closed
1 week ago
0
Use pypdfium2 for PDF rasterizing when possible
#376
Belval
closed
1 week ago
0
Allow PDF in for DetectDocumentText and AnalyzeDocument
#375
Belval
closed
1 week ago
0
Improve HTML linearization
#374
Belval
closed
1 week ago
0
Lambda layers for Python 3.12 PDF raising an exception on missing libpng16.so.16
#373
Viajante80
opened
1 week ago
3
Lambda layers for Python 3.12 raising an exception on missing libopenjp2.so.7
#372
Belval
closed
1 week ago
0
Is search_words() broken?
#371
ttruong-gilead
opened
2 weeks ago
0
Empty expense_documents on analyze_expense
#370
arsher-b
closed
1 week ago
3
Incorrect table cell word and line order
#369
wessens
opened
1 month ago
2
'NoneType' object has no attribute 'spatial_object' on Expense Analysis results
#368
HarryTSaban
opened
1 month ago
0
Use module name for logger instead of Root Logger
#367
michaelshum321
opened
1 month ago
0
pdf2image is required even though save_image=False
#366
vdefeo-caylent
opened
1 month ago
0
prefix and suffix for footer layout is not available
#365
LeoHemamou
opened
1 month ago
0
Exception handling is hiding the underlying issue of the error.
#364
vdefeo-caylent
opened
1 month ago
2
Add confidence scores at the DocumentEntity level
#363
Belval
closed
1 month ago
0
Add figure layout prefix and suffix
#362
Belval
closed
1 month ago
0
feature request: add query alias parameter
#361
parad0x96
closed
1 month ago
2
Fix missing figures
#360
Belval
closed
2 months ago
0
Access Non-Axis-Aligned Bounding Boxes
#359
zkalson
opened
2 months ago
2
Table cell, incorrectly, does not pick up the cell text/words. Page--> Line picks up the words as in the textract output
#358
raidken
opened
2 months ago
1
Update function doc and return type
#357
andrewkowalik
closed
1 week ago
1
issue with extraction, get_text_fromlayout_json function
#356
red-sky17
opened
2 months ago
1
cell content extraction error
#355
Larbo53
opened
2 months ago
2
Use AWS_REGION and AWS_DEFAULT_REGION environment variables in Textractor when available
#354
Belval
closed
2 months ago
0
[Feature Request] Simplified batch processing CLI
#353
athewsey
opened
2 months ago
1
Cryptic CLI error in SageMaker Studio (and probably other role-based environments?)
#352
athewsey
opened
2 months ago
1
Python Support for Column Headers
#351
Belval
opened
2 months ago
0
ensure cell block has text element
#349
qeternity
closed
1 month ago
4
KeyError in get_lines_string
#348
sbui-dev
opened
2 months ago
0
Exporting text+tables while maintaining layout
#347
austinmw
opened
2 months ago
1
Fixes issue #345 : S3 path parser
#346
anjanvb
closed
3 months ago
0
S3 path parsing for textractcaller is not robust enough
#345
anjanvb
opened
3 months ago
0
GH issue #343: Added key check
#344
dzmitry-kankalovich
opened
3 months ago
1
KeyError: 'Text' - on documents with tables
#343
dzmitry-kankalovich
opened
3 months ago
1
Set JPEG compression parameters
#342
Belval
closed
3 months ago
0
JPEG conversion in `analyze_document` significantly impacts table predictions
#341
Belval
opened
3 months ago
1
Handle None bounding box when parsing Queries
#340
Belval
closed
3 months ago
0
Handle null EntityTypes
#339
Belval
closed
3 months ago
0
Textractor import error
#338
umaaaaaaaaa
closed
3 months ago
1
Large PDF response processing is slow
#337
Belval
opened
3 months ago
0
Proper way of getting cell content?
#336
ttruong-gilead
opened
3 months ago
5
Parsing response from a start_document_analysis()
#335
ttruong-gilead
closed
3 months ago
2
Add missing entities in docs
#334
Belval
closed
3 months ago
0
Error in get_layout_text_from_json in textractprettyprinter
#333
gwynethguo
opened
3 months ago
0
Add CITATION.cff
#332
Belval
closed
3 months ago
0