simonw / s3-ocr

Tools for running OCR against files stored in S3
Apache License 2.0

Textract needs to run in the same region as the S3 bucket #24

Closed: simonw closed this issue 2 years ago

simonw commented 2 years ago

Got this error from s3-ocr start s3-ocr-many-pdfs --all:

botocore.errorfactory.InvalidS3ObjectException: An error occurred (InvalidS3ObjectException) when calling the StartDocumentTextDetection operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.

https://stackoverflow.com/a/64511389/6083 suggests that the fix is to pass the region here:

client = boto3.client('textract', region_name='us-east-2')
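
Rather than hard-coding a region, one option is to look up the bucket's region and create the Textract client there. The following is a minimal sketch, not necessarily how s3-ocr handles it: the helper names `bucket_region` and `textract_client_for_bucket` are hypothetical, and it assumes the caller's credentials allow `s3:GetBucketLocation` on the bucket.

```python
def bucket_region(s3_client, bucket):
    """Return the region name of `bucket` via S3's GetBucketLocation call."""
    location = s3_client.get_bucket_location(Bucket=bucket)["LocationConstraint"]
    # S3 reports buckets in us-east-1 as None, and some legacy buckets as "EU".
    if location is None:
        return "us-east-1"
    if location == "EU":
        return "eu-west-1"
    return location


def textract_client_for_bucket(bucket):
    """Create a Textract client in the same region as `bucket` (hypothetical helper)."""
    # Imported here so bucket_region() can be used with any S3-like client object.
    import boto3

    s3 = boto3.client("s3")
    return boto3.client("textract", region_name=bucket_region(s3, bucket))
```

With this in place, `textract_client_for_bucket("s3-ocr-many-pdfs")` would return a client whose region matches the bucket, which should avoid the InvalidS3ObjectException above when the default region differs from the bucket's.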

Originally posted by @simonw in https://github.com/simonw/s3-ocr/issues/21#issuecomment-1207322670