alphagov / asset-manager

Manages uploaded assets (images, PDFs etc.) for applications on GOV.UK
https://docs.publishing.service.gov.uk/apps/asset-manager.html
MIT License

Serve Whitehall's classification featuring images from Asset Manager #401

chrisroos closed this issue 6 years ago

chrisroos commented 6 years ago

This has been extracted from https://github.com/alphagov/asset-manager/issues/215 to make it easier to manage the remaining work. See that issue for lots more information.

Example asset: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/classification_featuring_image_data/file/1/Events-960.jpg

Todo

chrisroos commented 6 years ago

I ran the following commands in integration. We need to run the same commands in production to give us confidence that the Whitehall NFS mount and Asset Manager database are in sync before we switch the config:

$ find /data/uploads/whitehall/clean/system/uploads/classification_featuring_image_data/file -type f | wc -l
15512

> WhitehallAsset.where(legacy_url_path: %r(/government/uploads/system/uploads/classification_featuring_image_data/file/)).count
=> 15512
> WhitehallAsset.deleted.where(legacy_url_path: %r(/government/uploads/system/uploads/classification_featuring_image_data/file/)).count
=> 0

Note that these figures aren't necessarily realistic - it's the commands we're interested in.
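
For anyone repeating this check, the comparison can be sketched in Ruby roughly as follows. This is only an illustration: `count_files` and `in_sync?` are hypothetical helper names, and the pass condition (equal counts and no matching assets marked as deleted) is an assumption drawn from the three commands above.

```ruby
require "find"

# Count regular files beneath a directory, roughly like
# `find DIR -type f | wc -l` (lstat is used so symlinks are not followed).
def count_files(dir)
  Find.find(dir).count { |path| File.lstat(path).file? }
end

# Assumed pass condition: every file has an asset record and
# none of the matching assets are marked as deleted.
def in_sync?(file_count:, asset_count:, deleted_count:)
  file_count == asset_count && deleted_count.zero?
end

# Using the figures reported above.
puts in_sync?(file_count: 15_512, asset_count: 15_512, deleted_count: 0)
```

In practice `file_count` would come from `count_files` run against the NFS mount, and the other two values from the `WhitehallAsset` queries in the Rails console.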

chrisroos commented 6 years ago

@gpeng has run the commands above in production.

$ find /data/uploads/whitehall/clean/system/uploads/classification_featuring_image_data/file -type f | wc -l
15512

> WhitehallAsset.where(legacy_url_path: %r(/government/uploads/system/uploads/classification_featuring_image_data/file/)).count
=> 15512
> WhitehallAsset.deleted.where(legacy_url_path: %r(/government/uploads/system/uploads/classification_featuring_image_data/file/)).count
=> 0

The number of files on the filesystem matches the number of assets in the database, so we can make the relevant nginx change to serve these from Asset Manager.
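
The `find` path and the `legacy_url_path` pattern refer to the same files under different prefixes. A minimal sketch of that mapping is below, which could also be used to list individual mismatches if the counts ever differed. The two prefixes are assumptions inferred from the commands and the example asset URL in this issue, and both helper names are hypothetical.

```ruby
# Assumed prefixes, inferred from the commands and example asset URL in this issue.
NFS_PREFIX = "/data/uploads/whitehall/clean"
URL_PREFIX = "/government/uploads"

# Convert a path on the Whitehall NFS mount to the matching legacy URL path.
def legacy_url_path(file_path)
  unless file_path.start_with?("#{NFS_PREFIX}/")
    raise ArgumentError, "not under #{NFS_PREFIX}: #{file_path}"
  end

  URL_PREFIX + file_path.delete_prefix(NFS_PREFIX)
end

# Return the legacy URL paths that exist on disk but not in the database list.
def missing_from_database(file_paths, database_paths)
  file_paths.map { |path| legacy_url_path(path) } - database_paths
end

example = "#{NFS_PREFIX}/system/uploads/classification_featuring_image_data/file/1/Events-960.jpg"
puts legacy_url_path(example)
```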

chrisroos commented 6 years ago

I've requested the example asset in the description from integration and used Kibana to confirm that it was served by Whitehall.

$ curl -v "https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/classification_featuring_image_data/file/1/Events-960.jpg?CJR$RANDOM" > /dev/null

> GET /government/uploads/system/uploads/classification_featuring_image_data/file/1/Events-960.jpg?CJR23637 HTTP/2
> Host: assets-origin.integration.publishing.service.gov.uk
> User-Agent: curl/7.54.0
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 200 
< date: Tue, 16 Jan 2018 12:01:16 GMT
< content-type: image/jpeg
< content-length: 208338
< server: nginx
< accept-ranges: bytes
< cache-control: max-age=14400, public
< content-disposition: inline; filename="Events-960.jpg"
< etag: "576d968d-32dd2"
< last-modified: Fri, 24 Jun 2016 20:22:37 GMT
< x-frame-options: SAMEORIGIN
< access-control-allow-origin: *
< access-control-allow-methods: GET, OPTIONS
< access-control-allow-headers: origin, authorization

# Kibana logs - searching for CJR23637
January 16th 2018, 12:01:17.000  -   -  assets-origin-json.event.access
January 16th 2018, 12:01:16.771  -   -  whitehall
January 16th 2018, 12:01:16.000  -   -  whitehall-frontend-json.event.access
January 16th 2018, 12:01:16.000  -   -  whitehall-admin-json.event.access
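
The `?CJR$RANDOM` suffix gives each request a distinct query string that can then be searched for in Kibana. As a rough sketch of the same idea in Ruby (the `tagged_url` helper and its `prefix` parameter are hypothetical, and the number range simply mirrors bash's `$RANDOM`):

```ruby
require "securerandom"
require "uri"

# Append a searchable token as the query string of a URL.
# Returns the tagged URL and the token to search for in the logs.
def tagged_url(base_url, prefix: "CJR")
  token = "#{prefix}#{SecureRandom.random_number(32_768)}" # 0..32767, like $RANDOM
  uri = URI.parse(base_url)
  uri.query = token
  [uri.to_s, token]
end

url, token = tagged_url(
  "https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/classification_featuring_image_data/file/1/Events-960.jpg"
)
puts "Request: #{url}"
puts "Search the logs for: #{token}"
```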

chrisroos commented 6 years ago

I've opened https://github.com/alphagov/govuk-puppet/pull/7098 to update the nginx config to serve these assets from Asset Manager.

chrisroos commented 6 years ago

I merged https://github.com/alphagov/govuk-puppet/pull/7098 yesterday.

chrisroos commented 6 years ago

I didn't test the effect of https://github.com/alphagov/govuk-puppet/pull/7098 in integration, but it's now been deployed to production and I've confirmed that we're serving these assets from Asset Manager:

$ curl -v "https://assets.publishing.service.gov.uk/government/uploads/system/uploads/classification_featuring_image_data/file/1/Events-960.jpg?CJR$RANDOM" > /dev/null

> GET /government/uploads/system/uploads/classification_featuring_image_data/file/1/Events-960.jpg?CJR24832 HTTP/1.1
> Host: assets.publishing.service.gov.uk
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: image/jpeg
< Content-Disposition: inline; filename="Events-960.jpg"
< Cache-Control: max-age=14400, public
< ETag: "50e2ab3b-32dd2"
< Last-Modified: Tue, 01 Jan 2013 09:24:11 GMT
< X-Frame-Options: SAMEORIGIN
< Strict-Transport-Security: max-age=31536000
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: GET, OPTIONS
< Access-Control-Allow-Headers: origin, authorization
< Fastly-Backend-Name: origin
< Content-Length: 208338
< Accept-Ranges: bytes
< Date: Wed, 17 Jan 2018 13:34:21 GMT
< Via: 1.1 varnish
< Age: 0
< Connection: keep-alive
< X-Served-By: cache-lcy19234-LCY
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1516196061.292625,VS0,VE373

# Kibana logs - searching for CJR24832
January 17th 2018, 13:34:21.378  -   -  asset-manager
January 17th 2018, 13:34:21.000  -   -  asset-manager.publishing.service.gov.uk-json.event.access
January 17th 2018, 13:34:21.000  -   -  assets-origin.publishing.service.gov.uk-json.event.access
January 17th 2018, 13:34:21.000  -   -  static.publishing.service.gov.uk-json.event.access
January 17th 2018, 13:34:21.000  -   -  asset-manager.publishing.service.gov.uk-json.event.access
January 17th 2018, 13:34:21.000  -   -  static.publishing.service.gov.uk-json.event.access
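
The ETag and Last-Modified values also differ from the earlier Whitehall response, even though the content length is the same. As a rough sketch, assuming both ETags follow the `<hex mtime>-<hex size>` layout that nginx generates for static files, they can be decoded like this (`decode_etag` is a hypothetical helper):

```ruby
require "time"

# Split an nginx-style ETag into its modification time and size in bytes.
# Assumes the "<hex mtime>-<hex size>" layout; other ETag formats will not decode sensibly.
def decode_etag(etag)
  mtime_hex, size_hex = etag.delete('"').split("-", 2)
  { mtime: Time.at(mtime_hex.to_i(16)).utc, size: size_hex.to_i(16) }
end

whitehall     = decode_etag('"576d968d-32dd2"') # from the integration response
asset_manager = decode_etag('"50e2ab3b-32dd2"') # from the production response

puts whitehall[:mtime].httpdate     # Fri, 24 Jun 2016 20:22:37 GMT
puts asset_manager[:mtime].httpdate # Tue, 01 Jan 2013 09:24:11 GMT
puts whitehall[:size]               # 208338
puts asset_manager[:size]           # 208338
```

The decoded values agree with the Last-Modified and Content-Length headers in each response: the same number of bytes with a different modification time. Note that the two requests were also made against different environments, so the difference on its own doesn't prove which application served the file.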

chrisroos commented 6 years ago

This is all done so I'm closing this issue.