alphagov / asset-manager

Manages uploaded assets (images, PDFs etc.) for applications on GOV.UK
https://docs.publishing.service.gov.uk/apps/asset-manager.html
MIT License

Serve Whitehall's people images from Asset Manager #402

Closed chrisroos closed 6 years ago

chrisroos commented 6 years ago

This has been extracted from https://github.com/alphagov/asset-manager/issues/215 to make it easier to manage the remaining work. See that issue for lots more information.

Example asset: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/person/image/1/David_Cameron.jpg


chrisroos commented 6 years ago

I ran the following commands in integration. We need to run the same commands in production to give us confidence that the Whitehall NFS mount and Asset Manager database are in sync before we switch the config:

$ find /data/uploads/whitehall/clean/system/uploads/person/image -type f | wc -l
25306

> WhitehallAsset.where(legacy_url_path: %r(/government/uploads/system/uploads/person/image/)).count
=> 24172
> WhitehallAsset.deleted.where(legacy_url_path: %r(/government/uploads/system/uploads/person/image/)).count
=> 36

Note that the integration figures above aren't necessarily realistic; it's the commands we're interested in.

chrisroos commented 6 years ago

@gpeng has run the commands above in production.

$ find /data/uploads/whitehall/clean/system/uploads/person/image -type f | wc -l
24193

> WhitehallAsset.where(legacy_url_path: %r(/government/uploads/system/uploads/person/image/)).count
=> 24193
> WhitehallAsset.deleted.where(legacy_url_path: %r(/government/uploads/system/uploads/person/image/)).count
=> 17

The number of files on the filesystem matches the number of assets in the database so we can make the relevant nginx change to serve these from Asset Manager.
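
As a rough sketch, the filesystem half of this check could be wrapped in a small shell function. `check_sync` is a hypothetical helper name, not part of Asset Manager, and the expected count is assumed to be copied by hand from the Rails console output above.

```shell
# check_sync DIR EXPECTED
# Hypothetical helper: counts the regular files under DIR (as the find
# command above does) and compares the total with EXPECTED, which is
# assumed to be the asset count reported by the Rails console query.
check_sync() {
  dir=$1
  expected=$2
  # tr strips the leading padding that some wc implementations add
  actual=$(find "$dir" -type f | wc -l | tr -d ' ')
  if [ "$actual" -eq "$expected" ]; then
    echo "in sync: $actual files"
  else
    echo "mismatch: $actual files on disk, $expected assets in database"
    return 1
  fi
}
```

For the production figures above this would be called as `check_sync /data/uploads/whitehall/clean/system/uploads/person/image 24193`. Like the original command, it counts lines of `find` output, so a filename containing a newline would be counted more than once.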

chrisroos commented 6 years ago

I've requested the example asset in the description from integration and used Kibana to confirm that it was served by Whitehall.

$ curl -v "https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person/image/1/David_Cameron.jpg?CJR$RANDOM" > /dev/null

> GET /government/uploads/system/uploads/person/image/1/David_Cameron.jpg?CJR20634 HTTP/2
> Host: assets-origin.integration.publishing.service.gov.uk
> User-Agent: curl/7.54.0
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 200 
< date: Tue, 16 Jan 2018 12:02:56 GMT
< content-type: image/jpeg
< content-length: 365691
< server: nginx
< accept-ranges: bytes
< cache-control: max-age=14400, public
< content-disposition: inline; filename="David_Cameron.jpg"
< etag: "57707a3e-5947b"
< last-modified: Mon, 27 Jun 2016 00:58:38 GMT
< x-frame-options: SAMEORIGIN
< access-control-allow-origin: *
< access-control-allow-methods: GET, OPTIONS
< access-control-allow-headers: origin, authorization

# Kibana logs - searching for CJR20634
January 16th 2018, 12:02:57.000  -   -  assets-origin-json.event.access
January 16th 2018, 12:02:56.817  -   -  whitehall
January 16th 2018, 12:02:56.000  -   -  whitehall-frontend-json.event.access
January 16th 2018, 12:02:56.000  -   -  whitehall-admin-json.event.access
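
The check above works by adding a unique query string to the request so that the same token can be searched for in Kibana. A minimal sketch of that step follows; `tagged_url` is a hypothetical helper name, and it assumes the URL has no query string of its own.

```shell
# tagged_url URL PREFIX
# Hypothetical helper: appends PREFIX plus a number as the query string,
# so the request can be found in the logs by searching for that token.
# $RANDOM is a bash/zsh feature; where it is unset the shell's PID is used.
tagged_url() {
  url=$1
  prefix=$2
  token="${prefix}${RANDOM:-$$}"
  echo "${url}?${token}"
}
```

For example, `url=$(tagged_url "https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person/image/1/David_Cameron.jpg" CJR)` followed by `curl -sv "$url" > /dev/null` makes the request, and `echo "${url##*\?}"` prints the token to search for.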

chrisroos commented 6 years ago

I've opened https://github.com/alphagov/govuk-puppet/pull/7099 to update the nginx config to start serving people images from Asset Manager.

chrisroos commented 6 years ago

I've merged https://github.com/alphagov/govuk-puppet/pull/7099, so I'll wait for it to be deployed/applied in integration before checking the effect.

chrisroos commented 6 years ago

I didn't check the effect of https://github.com/alphagov/govuk-puppet/pull/7099 in integration, but it has now been deployed to production and I've confirmed that these assets are being served by Asset Manager as expected.

$ curl -v "https://assets.publishing.service.gov.uk/government/uploads/system/uploads/person/image/1/David_Cameron.jpg?CJR$RANDOM" > /dev/null

> GET /government/uploads/system/uploads/person/image/1/David_Cameron.jpg?CJR31506 HTTP/1.1
> Host: assets.publishing.service.gov.uk
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: image/jpeg
< Content-Disposition: inline; filename="David_Cameron.jpg"
< Cache-Control: max-age=14400, public
< ETag: "50a42b62-5947b"
< Last-Modified: Wed, 14 Nov 2012 23:38:10 GMT
< X-Frame-Options: SAMEORIGIN
< Strict-Transport-Security: max-age=31536000
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: GET, OPTIONS
< Access-Control-Allow-Headers: origin, authorization
< Fastly-Backend-Name: origin
< Content-Length: 365691
< Accept-Ranges: bytes
< Date: Wed, 17 Jan 2018 13:36:24 GMT
< Via: 1.1 varnish
< Age: 0
< Connection: keep-alive
< X-Served-By: cache-lcy19233-LCY
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1516196185.667310,VS0,VE300

# Kibana logs - searching for CJR31506
January 17th 2018, 13:36:24.710  -   -  asset-manager
January 17th 2018, 13:36:24.000  -   -  asset-manager.publishing.service.gov.uk-json.event.access
January 17th 2018, 13:36:24.000  -   -  asset-manager.publishing.service.gov.uk-json.event.access
January 17th 2018, 13:36:24.000  -   -  assets-origin.publishing.service.gov.uk-json.event.access
January 17th 2018, 13:36:24.000  -   -  static.publishing.service.gov.uk-json.event.access
January 17th 2018, 13:36:24.000  -   -  static.publishing.service.gov.uk-json.event.access
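
Because the production request goes through the CDN, it can help to confirm from the response headers that the request was a cache miss (`X-Cache: MISS` above) before looking for it in the origin logs. A small sketch for pulling one header out of `curl -v` output follows; `header_value` is a hypothetical helper name.

```shell
# header_value NAME
# Hypothetical helper: reads `curl -v` (or `curl -i`) output on stdin and
# prints the value of the first response header called NAME, ignoring case.
header_value() {
  name=$1
  tr -d '\r' |
    sed 's/^< //' |
    awk -F': ' -v n="$name" '
      tolower($1) == tolower(n) {
        # print everything after "Name: " so values containing ": " survive
        print substr($0, length($1) + 3)
        exit
      }'
}
```

Since `curl -v` writes headers to stderr, a call would look like `curl -sv "$url" 2>&1 >/dev/null | header_value X-Cache`.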

chrisroos commented 6 years ago

This is all done so I'm closing this issue.