anarazel / pg-vm-images

builds VM images for postgres CI testing

Image creation might succeed, even though packer failed, leading to missing image permissions #50

Closed: anarazel closed this issue 1 year ago

anarazel commented 1 year ago

E.g. https://cirrus-ci.com/task/6317071085076480?logs=build_image#L447

2023-02-07T21:42:02Z: ==> windows.googlecompute.windows-ci-vs-2019: Creating image...
[21:47:04.615] 2023-02-07T21:47:04Z: ==> windows.googlecompute.windows-ci-vs-2019: Error waiting for image: time out while waiting for image to register
[21:47:04.615] 2023-02-07T21:47:04Z: ==> windows.googlecompute.windows-ci-vs-2019: Deleting disk...
[21:47:04.849] 2023-02-07T21:47:04Z: ==> windows.googlecompute.windows-ci-vs-2019: Error deleting disk. Please delete it manually.
[21:47:04.849] ==> windows.googlecompute.windows-ci-vs-2019: 
[21:47:04.849] ==> windows.googlecompute.windows-ci-vs-2019: DiskName: build-windows-ci-vs-2019-2023-02-07t21-33
[21:47:04.849] ==> windows.googlecompute.windows-ci-vs-2019: Zone: us-west1-a
[21:47:04.849] ==> windows.googlecompute.windows-ci-vs-2019: Error: googleapi: Error 400: The disk resource 'projects/pg-ci-images/zones/us-west1-a/disks/build-windows-ci-vs-2019-2023-02-07t21-33' is already being used by 'projects/pg-ci-images/global/images/pg-ci-windows-ci-vs-2019-2023-02-07t21-33', resourceInUseByAnotherResource
[21:47:04.850] 2023-02-07T21:47:04Z:     windows.googlecompute.windows-ci-vs-2019: Disk has been deleted!
[21:47:04.850] 2023-02-07T21:47:04Z: ==> windows.googlecompute.windows-ci-vs-2019: Provisioning step had errors: Running the cleanup provisioner, if present...
[21:47:04.850] 2023-02-07T21:47:04Z: Build 'windows.googlecompute.windows-ci-vs-2019' errored after 13 minutes 54 seconds: Error waiting for image: time out while waiting for image to register
[21:47:04.852] 
[21:47:04.852] ==> Wait completed after 13 minutes 54 seconds
[21:47:04.852] 
[21:47:04.852] ==> Some builds didn't complete successfully and had errors:
[21:47:04.852] --> windows.googlecompute.windows-ci-vs-2019: Error waiting for image: time out while waiting for image to register
[21:47:04.852] 
[21:47:04.852] ==> Builds finished but no artifacts were created.

Despite the errors above, the image was actually created, which then led to these failures:

Failed to start an instance: INVALID_ARGUMENT: Forbidden 403 Forbidden POST https://compute.googleapis.com:443/compute/v1/projects/cirrus-ci-community/zones/us-central1-c/instances { "error": { "code": 403, "message": "Required 'compute.images.useReadOnly' permission for 'projects/pg-ci-images/global/images/pg-ci-windows-ci-vs-2019-2023-02-07t21-33'", "errors": [ { "message": "Required 'compute.images.useReadOnly' permission for 'projects/pg-ci-images/global/images/pg-ci-windows-ci-vs-2019-2023-02-07t21-33'", "domain": "global", "reason": "forbidden" } ] } } 

The best option would be to not assign the family inside packer, but do it afterwards, after the permission has been set up. However, there does not appear to be an option to add a family to an image.

Nor have I found a way to set up a default policy for image permissions, which would be even better.
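To make the failure mode above visible, a build script could check whether the image exists after packer reports an error. The following is only a minimal sketch: the helper name `image_left_behind` is hypothetical, and it assumes that `gcloud compute images describe` exits non-zero when the image does not exist.

```shell
# Sketch: report whether an image exists even though packer reported a failure.
# image_left_behind is a hypothetical helper name, not part of this repository.
image_left_behind() {
    project=$1
    image=$2
    # Assumption: `describe` exits non-zero when the image is missing.
    gcloud compute images describe "$image" --project "$project" \
        --format 'value(name)' >/dev/null 2>&1
}
```

A caller could run this only when packer exits with an error and, if the image is present, either grant the missing permission or remove the image before it is picked up through its family.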

anarazel commented 1 year ago

> The best option would be to not assign the family inside packer, but do it afterwards, after the permission has been set up. However, there does not appear to be an option to add a family to an image.

There is. Either there didn't use to be one, or I somehow misread the language previously.

gcloud compute images update --project <project> --family <family> <image-name>

does work.

@nbyavuz If you have time, it'd be great to build all images without a family, grant access, and only then add the family.

A later step would be to test the newly built image and only then add the family.
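The ordering proposed above could be sketched roughly as follows. This is not the repository's actual script: the function name `publish_image`, the default member `allAuthenticatedUsers`, and the role `roles/compute.imageUser` are assumptions chosen for illustration. Only the `gcloud compute images update --family` call comes from this thread.

```shell
# Sketch: grant access to a freshly built image first, and only then add it
# to its family, so the family never points at an image CI cannot read.
# publish_image is a hypothetical helper; member and role are assumptions.
publish_image() {
    project=$1
    image=$2
    family=$3
    member=${4:-allAuthenticatedUsers}   # assumed default, adjust as needed

    # 1. Grant read access. Stop here if the grant fails.
    gcloud compute images add-iam-policy-binding "$image" \
        --project "$project" \
        --member "$member" \
        --role roles/compute.imageUser || return 1

    # 2. Only now assign the family (command taken from the comment above).
    gcloud compute images update "$image" \
        --project "$project" \
        --family "$family"
}
```

For this to help, the packer template itself would have to stop setting the image family, so that the image stays out of the family until step 2 runs. A test of the new image could later be slotted in between the two steps.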

anarazel commented 1 year ago

> There is. Either there didn't use to be one, or I somehow misread the language previously.

https://cloud.google.com/sdk/docs/release-notes#compute_engine_85

333.0.0 (2021-03-23) ... Promoted --description and --family flags of gcloud compute images update to GA.