1. Introduction
Introduces the project of building an image processing pipeline that reads image URLs from a CSV file, downloads and processes the images, and uploads them to cloud storage.
2. Learning Objectives
Lists the key learning goals of the project, including defining batch processing, using Go to run existing software, designing a CLI application, and using cloud storage technology.
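As an illustration of "using Go to run existing software", the sketch below shells out to ImageMagick with the standard os/exec package. It rests on assumptions not stated in the project: that an ImageMagick binary named convert is on the PATH (newer ImageMagick releases name the binary magick), and that -monochrome is the transformation wanted. The helper names convertArgs and processImage are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// convertArgs builds the argument list passed to ImageMagick.
// The -monochrome option is an illustrative choice, not taken from the project.
func convertArgs(inPath, outPath string) []string {
	return []string{inPath, "-monochrome", outPath}
}

// processImage runs the external "convert" binary (assumed to be on PATH)
// and includes the tool's combined output in the error if it fails.
func processImage(inPath, outPath string) error {
	cmd := exec.Command("convert", convertArgs(inPath, outPath)...)
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("convert %s: %w: %s", inPath, err, out)
	}
	return nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: process <input> <output>")
		os.Exit(2)
	}
	if err := processImage(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Capturing the combined output is a design choice: when the external tool fails, its own message is usually the most useful part of the error.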
3. Project Setup
Explains how to set up and run the provided scaffolding code using Docker and Make.
4. Specification
Outlines the requirements for the CLI tool, including reading the input CSV, downloading and processing the images, uploading them to AWS S3, and writing an output CSV.
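A minimal sketch of how such a CLI might accept its two file paths, using the standard flag package. The flag names --input and --output, the config type, and the parseArgs helper are assumptions made for illustration; the specification itself defines the real interface.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// config holds the command-line options (hypothetical names).
type config struct {
	inputPath  string
	outputPath string
}

// parseArgs parses args (without the program name) into a config.
// Taking args as a parameter keeps the function easy to test.
func parseArgs(args []string) (config, error) {
	var cfg config
	fs := flag.NewFlagSet("batch", flag.ContinueOnError)
	fs.StringVar(&cfg.inputPath, "input", "", "path to the CSV file of image URLs")
	fs.StringVar(&cfg.outputPath, "output", "", "path to write the results CSV")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	if cfg.inputPath == "" || cfg.outputPath == "" {
		return config{}, errors.New("both --input and --output are required")
	}
	return cfg, nil
}

func main() {
	cfg, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Printf("reading %s, writing %s\n", cfg.inputPath, cfg.outputPath)
}
```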
5. How-to
Provides guidance on implementing various aspects of the project, such as reading CSV files, downloading files, image processing, and interacting with AWS S3.
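The first two how-to topics, reading CSV files and downloading files, can be sketched with the standard library alone. This is an illustration under assumptions: the input is taken to have a header row with a single column named url, CSV data arrives on standard input, and downloads are saved under made-up names such as image-0. The function names are hypothetical.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"net/http"
	"os"
)

// urlsFromRecords extracts the URL column from parsed CSV rows.
// Assumes a header row whose only column is named "url" (hypothetical layout).
func urlsFromRecords(records [][]string) ([]string, error) {
	if len(records) == 0 {
		return nil, errors.New("empty CSV: missing header row")
	}
	if len(records[0]) != 1 || records[0][0] != "url" {
		return nil, fmt.Errorf("unexpected header %q, want a single \"url\" column", records[0])
	}
	urls := make([]string, 0, len(records)-1)
	for _, row := range records[1:] {
		if len(row) != 1 {
			return nil, fmt.Errorf("unexpected row %q, want one field", row)
		}
		urls = append(urls, row[0])
	}
	return urls, nil
}

// readURLs parses CSV data from r and returns the URLs it lists.
func readURLs(r io.Reader) ([]string, error) {
	records, err := csv.NewReader(r).ReadAll()
	if err != nil {
		return nil, fmt.Errorf("reading CSV: %w", err)
	}
	return urlsFromRecords(records)
}

// download fetches url and writes the body to destPath,
// treating any non-200 status as an error.
func download(url, destPath string) error {
	resp, err := http.Get(url)
	if err != nil {
		return fmt.Errorf("fetching %s: %w", url, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("fetching %s: unexpected status %s", url, resp.Status)
	}
	f, err := os.Create(destPath)
	if err != nil {
		return err
	}
	defer f.Close()
	if _, err := io.Copy(f, resp.Body); err != nil {
		return fmt.Errorf("saving %s: %w", url, err)
	}
	return f.Close()
}

func main() {
	urls, err := readURLs(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for i, u := range urls {
		dest := fmt.Sprintf("image-%d", i) // hypothetical naming scheme
		if err := download(u, dest); err != nil {
			fmt.Fprintln(os.Stderr, err) // report the failure and keep going
			continue
		}
		fmt.Println("saved", dest)
	}
}
```

Checking the status code explicitly matters because http.Get only returns an error for transport problems; a 404 or 500 response still comes back with a nil error.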
6. Extensions
Suggests additional features and enhancements to the project, like handling failures, avoiding duplicate downloads and uploads, and parallel processing using goroutines.
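For the failure-handling extension, one possible shape of retry with backoff is sketched below. It is a simplified illustration, not the project's reference solution: the retry helper and errTemporary value are hypothetical, the delay simply doubles after each failed attempt, and every error is treated as retryable, whereas a real pipeline would probably retry only temporary failures.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTemporary stands in for a transient download or upload failure.
var errTemporary = errors.New("temporary failure")

// retry calls op up to attempts times, sleeping between failures and
// doubling the delay each time (exponential backoff).
func retry(attempts int, initialDelay time.Duration, op func() error) error {
	if attempts < 1 {
		return errors.New("attempts must be at least 1")
	}
	var err error
	delay := initialDelay
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 { // no need to sleep after the final attempt
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(4, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTemporary // fail the first two calls
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```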
Key Things To Learn:
Batch processing and how it differs from building servers
Configuring a Makefile to run a Docker image
Using Go to run existing software (ImageMagick) to complete tasks
Designing and building a CLI application to batch process images
Reading from and uploading data to cloud storage technology (AWS S3)
Reading, modifying, and extending existing code
Using the encoding/csv package to read and write CSV files
Downloading files using the net/http package and handling potential errors and non-200 status codes
Creating and configuring an AWS S3 bucket for storing processed images
Setting up IAM roles and policies to allow access to the S3 bucket
Using the AWS SDK for Go to interact with Amazon S3
Configuring AWS credentials and passing them to the Docker container
Implementing a retry-with-backoff strategy for handling temporary failures in downloading and uploading images
Avoiding re-uploading and re-processing the same images without using a database
Using goroutines to process and upload images in parallel for improved performance
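The last item in the list, processing images in parallel with goroutines, could look roughly like the worker pool below. This is a sketch under assumptions: processAll and the result type are hypothetical names, and the process callback stands in for the real download, convert, and upload steps.

```go
package main

import (
	"fmt"
	"sync"
)

// result records the outcome of processing one URL.
type result struct {
	url string
	err error
}

// processAll runs process on every URL using a fixed number of worker
// goroutines and returns one result per URL, in input order.
func processAll(urls []string, workers int, process func(url string) error) []result {
	if workers < 1 {
		workers = 1 // avoid blocking forever with no workers
	}
	results := make([]result, len(urls))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one goroutine, so no lock is needed.
				results[i] = result{url: urls[i], err: process(urls[i])}
			}
		}()
	}
	for i := range urls {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	urls := []string{
		"https://example.com/a.jpg",
		"https://example.com/b.jpg",
		"https://example.com/c.jpg",
	}
	results := processAll(urls, 2, func(url string) error {
		// Placeholder for the download, convert, and upload steps.
		return nil
	})
	for _, r := range results {
		fmt.Println(r.url, r.err)
	}
}
```

Capping the number of workers, rather than starting one goroutine per URL, keeps the number of simultaneous downloads and ImageMagick processes bounded for large input files.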
Batch Processing
https://systems.codeyourfuture.io/projects/batch-processing/