guardian / grid

The Guardian’s image management system
https://www.theguardian.com/info/developer-blog/2015/aug/12/open-sourcing-grid-image-service
Apache License 2.0

[image-loader] push -> pull based architecture #4201

Closed twrichards closed 6 months ago

twrichards commented 7 months ago

Co-authored-by: @dblatcher

The original issue is neatly described in https://github.com/guardian/grid/issues/4026; see also the follow-up comments there, where we explored the problem in the lead-up to this PR.

Infrastructure changes in https://github.com/guardian/editorial-tools-platform/pull/733

Change summary by area

/common-lib/**/aws/

/common-lib/**/config/

/dev/

/image-loader/

/kahuna/

Config: Kahuna will continue to use the existing method of posting image files directly to ImageLoader if no queue-based ingest bucket is defined; otherwise, it will:

  1. generate the mediaId (SHA-1 hash of the file) for each image client-side (this adds a new dependency, filehash)
  2. POST the filenames and media ids of the files to be uploaded to the /prepare endpoint to receive a presigned URL for each image
  3. PUT the file in S3 using the presigned URL
  4. on receiving an ‘OK’ response from S3, POST to the /uploadStatus endpoint to update the image status to ‘Queued’
  5. (as before) poll the /uploadStatus endpoint to GET the image status, updating the UI when an image is ‘Completed’ or ‘Failed’
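
The queue-based steps can be sketched from the client's point of view. This is a minimal TypeScript sketch, not Kahuna's actual code: the `LoaderApi` interface, `LocalFile`, and all function names are hypothetical, Node's `crypto` module stands in for the `filehash` dependency, and the hex encoding of the hash is an assumption. The payload shapes (media id -> filename in, media id -> presigned URL out) follow the description and the sequence diagram.

```typescript
import { createHash } from "node:crypto";

// Hypothetical transport abstraction; the real Kahuna code uses its own API client.
interface LoaderApi {
  // POST /prepare: media id -> filename in, media id -> presigned URL out
  prepare(body: Record<string, string>): Promise<Record<string, string>>;
  // PUT the raw bytes to S3 via the presigned URL; resolves true on an OK response
  putToS3(presignedUrl: string, bytes: Uint8Array): Promise<boolean>;
  // POST /uploadStatus to mark the image as Queued
  markQueued(mediaId: string): Promise<void>;
}

interface LocalFile {
  name: string;
  bytes: Uint8Array;
}

// Step 1: the media id is the SHA-1 hash of the file contents (hex encoding assumed).
function mediaIdOf(bytes: Uint8Array): string {
  return createHash("sha1").update(bytes).digest("hex");
}

// Step 2 payload: map of media id -> filename.
function buildPrepareBody(files: LocalFile[]): Record<string, string> {
  const body: Record<string, string> = {};
  for (const file of files) {
    body[mediaIdOf(file.bytes)] = file.name;
  }
  return body;
}

// Steps 2-4 for a batch of files. Returns the media ids that were queued.
async function uploadViaQueue(api: LoaderApi, files: LocalFile[]): Promise<string[]> {
  const presigned = await api.prepare(buildPrepareBody(files));
  const queued: string[] = [];
  for (const file of files) {
    const id = mediaIdOf(file.bytes);
    const url = presigned[id];
    if (url === undefined) continue; // no URL was issued for this file
    if (await api.putToS3(url, file.bytes)) {
      await api.markQueued(id);
      queued.push(id);
    }
  }
  return queued;
}
```

Keeping the transport behind an interface is only a convenience for the sketch: it lets the hashing and payload-building steps be exercised without a running ImageLoader or S3 bucket.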

Diagrams

Architecture

```mermaid
flowchart TD
    user(Kahuna)
    ftp(FTP server)
    feeds(Grid Feeds)
    image-loader(image-loader\nec2)
    ingest[(ingest bucket\ns3)]
    queue(SQS queue)
    bucket[(image bucket\ns3)]
    elastic[(elasticsearch)]
    status[(upload status\ndynamo)]
    user <-- request\n presigned URL --> image-loader
    user -- upload file\n(presigned URL) -->ingest
    ftp -- upload file--> ingest
    feeds -- upload file--> ingest
    ingest -- objectCreated --> queue
    queue <-- pulls messages --> image-loader
    image-loader -- move file from\n ingest bucket---> bucket
    image-loader -- add image ---> elastic
    image-loader -- update status ---> status
```
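
To illustrate the pull side of the architecture, the sketch below shows how a consumer might turn the body of an SQS message carrying an S3 object-created notification into a list of objects to ingest. It is written in TypeScript for brevity and is not the image-loader's actual implementation; `IngestJob`, `decodeS3Key`, and `parseObjectCreated` are hypothetical names, and the sketch assumes S3 delivers its standard event notification JSON directly to the queue.

```typescript
// One object in the ingest bucket that is waiting to be processed (hypothetical type).
interface IngestJob {
  bucket: string;
  key: string;
}

// Minimal view of the S3 event notification JSON; only the fields used here.
interface S3EventRecord {
  eventName?: string;
  s3?: { bucket?: { name?: string }; object?: { key?: string } };
}

// S3 URL-encodes object keys in notifications and uses "+" for spaces.
function decodeS3Key(rawKey: string): string {
  return decodeURIComponent(rawKey.replace(/\+/g, " "));
}

function parseObjectCreated(messageBody: string): IngestJob[] {
  const parsed = JSON.parse(messageBody) as { Records?: S3EventRecord[] };
  const jobs: IngestJob[] = [];
  for (const record of parsed.Records ?? []) {
    const bucket = record.s3?.bucket?.name;
    const key = record.s3?.object?.key;
    // Skip anything that is not an object-created record with a bucket and key.
    if (!record.eventName?.startsWith("ObjectCreated:") || !bucket || !key) continue;
    jobs.push({ bucket, key: decodeS3Key(key) });
  }
  return jobs;
}
```

Messages without a `Records` array (such as the test event S3 sends when a notification is first configured) yield an empty list, so a consumer built this way can simply delete them from the queue.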

UI upload sequence

```mermaid
sequenceDiagram
    participant MediaAPI
    Kahuna->>Kahuna: Generate mediaId (SHA-1)
    Kahuna->>Image Loader: POST prepare (Map of id -> filename)
    Image Loader->>Image Loader: upload status: Prepared
    Image Loader-->> AWS: GeneratePresignedUrlRequest
    AWS-->> Image Loader: GeneratePresignedUrlResponse
    Image Loader->>Kahuna: Map of id -> presigned url
    Kahuna-->>AWS: PUT file in bucket
    AWS-->>Kahuna: upload confirmed
    Kahuna->>Image Loader: upload status: Queued
    Kahuna-->Image Loader: Poll: uploadStatus
    Kahuna->>Kahuna: Update UI
    AWS-->>Image Loader: SQS message - s3:ObjectCreated
    Image Loader->>Image Loader: Ingest image, upload status: Completed
    Kahuna-->Image Loader: Last Poll: uploadStatus
    Kahuna->>Kahuna: Update UI
    Kahuna-->MediaAPI: Poll: image details
    Kahuna->>Kahuna: Update UI
```

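
The polling part of this sequence can be sketched as a small loop that stops once the upload status reaches a final state. This is an illustrative TypeScript sketch, not Kahuna's code: `pollUntilDone`, `fetchStatus`, and the interval and attempt-limit parameters are assumptions; only the four status names come from the description above.

```typescript
// Status values named in the description and the sequence diagram.
type UploadStatus = "Prepared" | "Queued" | "Completed" | "Failed";

// Polling stops once the image loader reports a final state.
function isTerminal(status: UploadStatus): boolean {
  return status === "Completed" || status === "Failed";
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical poller: `fetchStatus` stands in for a GET on the uploadStatus endpoint.
async function pollUntilDone(
  fetchStatus: () => Promise<UploadStatus>,
  intervalMs: number,
  maxAttempts: number,
): Promise<UploadStatus> {
  let latest = await fetchStatus();
  for (let attempt = 1; attempt < maxAttempts && !isTerminal(latest); attempt++) {
    await sleep(intervalMs);
    latest = await fetchStatus();
  }
  return latest; // may still be non-terminal if maxAttempts ran out
}
```

A caller would update the UI with each returned status and, on ‘Completed’, go on to poll MediaAPI for the image details as shown in the diagram.
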
github-actions[bot] commented 7 months ago

Deploy build 12208 to TEST

All deployment options

- [Deploy build 12208 to TEST](https://riffraff.gutools.co.uk/deployment/deployAgain?project=media-service%3A%3Agrid%3A%3Aall&build=12208&stage=TEST&updateStrategy=MostlyHarmless&action=deploy)
- [Deploy parts of build 12208 to TEST by previewing it first](https://riffraff.gutools.co.uk/preview/yaml?project=media-service%3A%3Agrid%3A%3Aall&build=12208&stage=TEST&updateStrategy=MostlyHarmless)

From guardian/actions-riff-raff.

prout-bot commented 6 months ago

Seen on image-loader (merged by @twrichards 8 minutes and 51 seconds ago) Please check your changes!

prout-bot commented 6 months ago

Seen on auth, usage, metadata-editor, thrall, leases, cropper, collections, media-api, kahuna (merged by @twrichards 9 minutes and 56 seconds ago) Please check your changes!