Closed: chrisroos closed this issue 6 years ago
@andrewgarner has just run rake asset_manager:migrate_assets[uploaded/number10] in production. It's queued up 3186 files.
I suspect these jobs might need to wait until the uploads in https://github.com/alphagov/asset-manager/issues/404 have finished.
I've asked 2ndline to compare the number of these assets on the filesystem to the number that have been created in the Asset Manager database.
@h-lame ran the following commands in production to compare the assets on the filesystem to those in the Asset Manager database:
# Number 10 assets
$ find /data/uploads/whitehall/clean/uploaded/number10 -type f | wc -l
3186
$ govuk_app_console asset-manager
Loading production environment (Rails 5.1.4)
irb(main):001:0> WhitehallAsset.where(legacy_url_path: %r(/government/uploads/uploaded/number10/)).count
=> 3186
irb(main):002:0> WhitehallAsset.deleted.where(legacy_url_path: %r(/government/uploads/uploaded/number10/)).count
=> 0
The number of assets in the database matches the number on the filesystem, and none of them are marked as deleted, so we're all good to open a PR to update the nginx config to serve these assets from Asset Manager.
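For anyone repeating this check for another directory, here is a minimal Ruby sketch of the filesystem half of the comparison. The helper names `count_files` and `counts_match?` are hypothetical and not part of Asset Manager; the database count is assumed to come from a console query like the one above.

```ruby
require "find"

# Count regular files under dir, roughly like `find DIR -type f | wc -l`.
# lstat is used so that symlinks are not counted as regular files.
def count_files(dir)
  Find.find(dir).count { |path| File.lstat(path).file? }
end

# Compare the filesystem count with a count obtained from the database,
# e.g. the result of the WhitehallAsset query run in the console.
def counts_match?(dir, database_count)
  count_files(dir) == database_count
end
```

Calling `counts_match?("/data/uploads/whitehall/clean/uploaded/number10", 3186)` on the machine holding the uploads would reproduce the comparison made in this comment.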
I've opened https://github.com/alphagov/govuk-puppet/pull/7130 to update the nginx config to start serving these assets from Asset Manager.
For reference, I requested the example asset in the description to confirm that it's being served by Whitehall in production:
$ curl -v "https://assets.publishing.service.gov.uk/government/uploads/uploaded/number10/image_001.tiles/preview.jpg?CJR$RANDOM" > /dev/null
> GET /government/uploads/uploaded/number10/image_001.tiles/preview.jpg?CJR32305 HTTP/1.1
> Host: assets.publishing.service.gov.uk
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: image/jpeg
< Last-Modified: Tue, 26 Mar 2013 17:20:09 GMT
< ETag: "5151d8c9-103a2"
< Expires: Wed, 24 Jan 2018 01:02:31 GMT
< Cache-Control: max-age=43200, public
< Strict-Transport-Security: max-age=31536000
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: GET, OPTIONS
< Access-Control-Allow-Headers: origin, authorization
< Fastly-Backend-Name: origin
< Content-Length: 66466
< Accept-Ranges: bytes
< Date: Tue, 23 Jan 2018 13:02:31 GMT
< Via: 1.1 varnish
< Age: 0
< Connection: keep-alive
< X-Served-By: cache-lhr6343-LHR
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1516712551.003034,VS0,VE65
# Kibana search results for CJR32305
January 23rd 2018, 13:02:31.000 - - whitehall-admin.publishing.service.gov.uk-json.event.access
January 23rd 2018, 13:02:31.000 - - whitehall-admin.publishing.service.gov.uk-json.event.access
January 23rd 2018, 13:02:31.000 - - assets-origin.publishing.service.gov.uk-json.event.access
January 23rd 2018, 13:02:31.000 - - whitehall-frontend.publishing.service.gov.uk-json.event.access
January 23rd 2018, 13:02:31.000 - - whitehall-frontend.publishing.service.gov.uk-json.event.access
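The `?CJR$RANDOM` suffix on the curl requests in this thread gives each request a unique query string. Presumably that is why the response is a cache miss (`X-Cache: MISS`), and it also gives a token to search for in Kibana. A rough Ruby equivalent, in case it's useful for scripting the same check, might look like this. The `traceable_url` helper and the use of `SecureRandom` instead of `$RANDOM` are my own assumptions, not anything the apps provide.

```ruby
require "securerandom"
require "uri"

# Append a unique token to the URL's query string and return both the
# URL to request and the token to search for in the logs.
def traceable_url(url, prefix: "CJR")
  token = "#{prefix}#{SecureRandom.hex(4)}"
  uri = URI.parse(url)
  # Keep any existing query string and add the token after it.
  uri.query = [uri.query, token].compact.join("&")
  [uri.to_s, token]
end

url, token = traceable_url(
  "https://assets.publishing.service.gov.uk/government/uploads/uploaded/number10/image_001.tiles/preview.jpg"
)
puts "Request: #{url}"
puts "Search Kibana for: #{token}"
```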
I've tested the effect of this PR in integration and used Kibana to confirm that these assets are now being served by Asset Manager.
Note: we don't currently have a realistic set of assets or asset-manager data in integration, so I've had to create a Whitehall asset to mirror the example asset in the description.
# Create asset
$ export BEARER_TOKEN=`cat /etc/govuk/manuals-publisher/env.d/ASSET_MANAGER_BEARER_TOKEN`
$ echo `date` > tmp.txt
$ curl \
-H"Authorization: Bearer $BEARER_TOKEN" \
-H"Accept: application/json" \
https://asset-manager.integration.govuk-internal.digital/whitehall_assets \
--form "asset[file]=@tmp.txt" \
--form "asset[legacy_url_path]=/government/uploads/uploaded/number10/image_001.tiles/preview.jpg"
# Request the asset in integration
$ curl -v "https://assets-origin.integration.publishing.service.gov.uk/government/uploads/uploaded/number10/image_001.tiles/preview.jpg?CJR$RANDOM" > /dev/null
> GET /government/uploads/uploaded/number10/image_001.tiles/preview.jpg?CJR9100 HTTP/2
> Host: assets-origin.integration.publishing.service.gov.uk
> User-Agent: curl/7.54.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 200
< date: Tue, 23 Jan 2018 14:54:42 GMT
< content-type: text/plain
< content-length: 29
< server: nginx
< vary: Accept-Encoding
< accept-ranges: bytes
< cache-control: max-age=14400, public
< content-disposition: inline; filename="tmp.txt"
< etag: "5a674c8e-1d"
< last-modified: Tue, 23 Jan 2018 14:54:06 GMT
< strict-transport-security: max-age=31536000
< vary: Accept-Encoding
< vary: Accept-Encoding
< x-frame-options: SAMEORIGIN
< access-control-allow-origin: *
< access-control-allow-methods: GET, OPTIONS
< access-control-allow-headers: origin, authorization
# Search Kibana for CJR9100
January 23rd 2018, 14:54:42.193 - - asset-manager
January 23rd 2018, 14:54:42.000 - - asset-manager-json.event.access
January 23rd 2018, 14:54:42.000 - - assets-origin-json.event.access
January 23rd 2018, 14:54:42.000 - - static-json.event.access
These assets are now being served by Asset Manager in production. I made the following request:
$ curl -v "https://assets.publishing.service.gov.uk/government/uploads/uploaded/number10/image_001.tiles/preview.jpg?CRL$RANDOM" > /dev/null
> GET /government/uploads/uploaded/number10/image_001.tiles/preview.jpg?CRL6587 HTTP/1.1
> Host: assets.publishing.service.gov.uk
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: image/jpeg
< Content-Disposition: inline; filename="preview.jpg"
< Cache-Control: max-age=14400, public
< ETag: "5151d8c9-103a2"
< Last-Modified: Tue, 26 Mar 2013 17:20:09 GMT
< X-Frame-Options: SAMEORIGIN
< Strict-Transport-Security: max-age=31536000
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: GET, OPTIONS
< Access-Control-Allow-Headers: origin, authorization
< Fastly-Backend-Name: origin
< Content-Length: 66466
< Accept-Ranges: bytes
< Date: Wed, 24 Jan 2018 11:40:33 GMT
< Via: 1.1 varnish
< Age: 0
< Connection: keep-alive
< X-Served-By: cache-lhr6345-LHR
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1516794033.938863,VS0,VE202
I can see in Kibana that the request was eventually served by Asset Manager.
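Comparing the two production responses in this thread gives another rough signal that the backend changed: the later one has `Content-Disposition` and `X-Frame-Options` headers and a `Cache-Control` of `max-age=14400` rather than `max-age=43200`. This is only an observation from these two responses, not a documented contract. A small Ruby sketch for diffing saved `curl -v` output could look like this (the helper names are hypothetical):

```ruby
# Parse the "< Name: value" response header lines from `curl -v` output
# into a hash keyed by lower-cased header name. Repeated headers keep
# only the last value, which is enough for a rough comparison.
def parse_response_headers(curl_output)
  curl_output.each_line.with_object({}) do |line, headers|
    match = line.match(/\A<\s*([^:\s]+):\s*(.*?)\s*\z/)
    next unless match

    headers[match[1].downcase] = match[2]
  end
end

# Return { header_name => [before_value, after_value] } for every header
# whose value differs between the two responses (nil means absent).
def header_changes(before, after)
  (before.keys | after.keys).sort.each_with_object({}) do |name, changes|
    changes[name] = [before[name], after[name]] unless before[name] == after[name]
  end
end
```

Headers such as `Date`, `X-Served-By` and `X-Timer` will differ on every request, so they are best ignored when reading the result.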
I've moved the task to delete these assets to #405 so that we can close this issue.
The Number 10 assets are currently served by the PublicUploadsController in Whitehall. These are used by the virtual tour on the History of 10 Downing Street page. There appears to be an XML file for each "tour" and these contain links to the assets in the /uploaded/number10 directory on assets.publishing.service.gov.uk (e.g. image_001.xml.erb).
Example number10 asset: https://assets.publishing.service.gov.uk/government/uploads/uploaded/number10/image_001.tiles/preview.jpg
Tasks
Run asset_manager:migrate_assets[uploaded/number10] on production