Add S3 storage strategy

chavan-arvind commented 2 weeks ago

Related to #15

Add S3 storage strategy for output files.

New S3OCRStrategy Class: Add S3OCRStrategy class in app/ocr_strategies/s3.py implementing the OCRStrategy interface using boto3 to interact with AWS S3.
Update OCR Strategies: Import S3OCRStrategy in app/tasks.py and add s3 strategy to OCR_STRATEGIES dictionary.
Add AWS Configuration: Add AWS S3 configuration variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_BUCKET_NAME, AWS_REGION) in .env.example.
Add Dependency: Add boto3 dependency in app/requirements.txt.
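
For illustration only, here is a minimal sketch of how the upload side of these changes might look with boto3. The class name `S3OutputStore`, its `save` method, and the `s3://` return value are hypothetical and not taken from the PR; only the four environment variable names come from the list above. The client can be injected so the class works with a stub in tests.

```python
import os


class S3OutputStore:
    """Hypothetical helper that uploads OCR output text to an S3 bucket."""

    def __init__(self, bucket=None, client=None):
        # Bucket name falls back to the variable listed in .env.example.
        self.bucket = bucket or os.environ["AWS_S3_BUCKET_NAME"]
        if client is None:
            # Imported lazily so a stub client can be used without boto3 installed.
            import boto3

            client = boto3.client(
                "s3",
                aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
                aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
                region_name=os.environ["AWS_REGION"],
            )
        self.client = client

    def save(self, key, text):
        # put_object writes the whole body in one request, which suits small text outputs.
        self.client.put_object(Bucket=self.bucket, Key=key, Body=text.encode("utf-8"))
        return f"s3://{self.bucket}/{key}"
```

Passing a stub object with a `put_object` method as `client` is enough to exercise `save` without AWS credentials.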

pkarw commented 2 weeks ago

Hey @chavan-arvind! Thanks for this PR. Isn't it a duplicate of #19?

pkarw commented 2 weeks ago

I think you should review #10, as defining the storage strategy as an OCR strategy is definitely not the right approach. Please check PR #10 and try to apply the S3 strategy based on it, OK?

I cannot accept this PR as it is right now due to architecture concerns.
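
To make the architecture concern concrete, here is a rough sketch of storage kept as its own strategy, separate from OCR. The actual interface in PR #10 is not shown in this thread, so every name below (`StorageStrategy`, `LocalFilesystemStorage`, `STORAGE_STRATEGIES`, `store_result`) is an assumption for illustration only.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class StorageStrategy(ABC):
    """Hypothetical interface: decides where output goes, not how text is extracted."""

    @abstractmethod
    def save(self, name: str, content: str) -> str:
        """Persist content under name and return its location."""


class LocalFilesystemStorage(StorageStrategy):
    """Hypothetical strategy that writes output under a root directory."""

    def __init__(self, root):
        self.root = Path(root)

    def save(self, name, content):
        path = self.root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        return str(path)


# A registry separate from OCR_STRATEGIES; an S3 strategy would be added under "s3".
STORAGE_STRATEGIES = {"local": LocalFilesystemStorage}


def store_result(strategy_name, name, content, **kwargs):
    # Look up the storage backend by name, independently of the OCR strategy used.
    strategy = STORAGE_STRATEGIES[strategy_name](**kwargs)
    return strategy.save(name, content)
```

With this split, the OCR strategy only produces text, and the choice of storage backend is made independently of it.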

CatchTheTornado / pdf-extract-api

Add S3 storage strategy #18