charlesakalugwu closed this issue 5 years ago
There was a recent update to the `create_blob_from_vm.yml` playbook which tried to work around the missing `accessSas` field of `az disk grant-access`:
https://github.com/openshift/openshift-ansible/commit/19cb7aee31c846ee44905ab3457ad5562c50fa35
Unfortunately, the az version used inside openshift-ansible is determined by when the installer image build last ran.
As a first step, can we pin its version to a stable one so we stop these kinds of breakages? Long term we should move out of openshift-ansible; it's really a pita.
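One way to notice version drift early, sketched here in Python, is to parse the first line of `az -v` and compare it with the version the image is supposed to carry. The helper names and the pinned value in the usage line are hypothetical; the regex is based on the `azure-cli (2.0.46)` format shown in this thread and treats the parentheses as optional in case other releases print the version differently.

```python
import re


def parse_az_cli_version(version_output: str) -> str:
    """Return the azure-cli version found in the text printed by `az -v`."""
    for line in version_output.splitlines():
        # Matches e.g. "azure-cli (2.0.46)"; parentheses are optional (assumption).
        match = re.match(r"\s*azure-cli\s+\(?(\d+(?:\.\d+)*)\)?", line)
        if match:
            return match.group(1)
    raise ValueError("no azure-cli version line found")


def check_pinned(version_output: str, pinned: str) -> None:
    """Raise if the installed azure-cli version differs from the pinned one."""
    found = parse_az_cli_version(version_output)
    if found != pinned:
        raise RuntimeError(f"azure-cli {found} found, expected pinned {pinned}")


# Trimmed sample of the `az -v` output format quoted in this thread.
sample = "azure-cli (2.0.47)\nacr (2.1.6)\ncore (2.0.47)\n"
print(parse_az_cli_version(sample))
```

Calling `check_pinned(sample, "2.0.46")` would raise, which is the point: a build that silently picked up a newer cli fails before the playbook runs.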
Just did some manual tests with versions `2.0.46` and `2.0.47` of the az cli, trying to simulate the workflow encoded in the affected playbook.
version 2.0.46
$ az -v
azure-cli (2.0.46)
acr (2.1.5)
acs (2.3.4)
advisor (0.6.0)
ams (0.2.3)
appservice (0.2.4)
backup (1.2.1)
batch (3.4.0)
batchai (0.4.3)
billing (0.2.0)
botservice (0.1.1)
cdn (0.1.1)
cloud (2.1.0)
cognitiveservices (0.2.3)
command-modules-nspkg (2.0.2)
configure (2.0.18)
consumption (0.4.0)
container (0.3.4)
core (2.0.46)
cosmosdb (0.2.1)
dla (0.2.3)
dls (0.1.3)
dms (0.1.1)
eventgrid (0.2.0)
eventhubs (0.2.4)
extension (0.2.1)
feedback (2.1.4)
find (0.2.12)
interactive (0.3.30)
iot (0.3.2)
iotcentral (0.1.2)
keyvault (2.2.3)
lab (0.1.1)
maps (0.3.2)
monitor (0.2.3)
network (2.2.5)
nspkg (3.0.3)
policyinsights (0.1.0)
profile (2.1.1)
rdbms (0.3.2)
redis (0.3.2)
relay (0.1.2)
reservations (0.4.0)
resource (2.1.4)
role (2.1.5)
search (0.1.1)
servicebus (0.2.3)
servicefabric (0.1.3)
signalr (1.0.0)
sql (2.1.4)
storage (2.2.2)
telemetry (1.0.0)
vm (2.2.3)
Python location '/usr/lib64/az/bin/python'
Extensions directory '/home/charles/.azure/cliextensions'
Python (Linux) 2.7.15 (default, Sep 21 2018, 23:26:48)
[GCC 8.1.1 20180712 (Red Hat 8.1.1-5)]
Legal docs and information: aka.ms/AzureCliLegal
playbook steps with version 2.0.46
mkdir foo && cd foo
az vm show -g charlesakalugwu-dev -n vm > az.vm.show.json
az storage account keys list -n openshiftimages -g images > az.storage.account.json
az disk grant-access --ids $(cat az.vm.show.json | jq .storageProfile.osDisk.managedDisk.id --raw-output) --duration-in-seconds 60 > az.disk.grant.json
cat az.disk.grant.json
cd .. && rm -rf foo
response to `az disk grant-access` for 2.0.46:
{
  "accessSas": null,
  "endTime": "2018-10-10T23:42:17.6474612+00:00",
  "name": "d2060e48-2eb2-4e7c-b832-9f9c6f321f03",
  "properties": {
    "output": {
      "accessSAS": "xxxxxx"
    }
  },
  "startTime": "2018-10-10T23:42:17.428687+00:00",
  "status": "Succeeded"
}
version 2.0.47
$ az -v
azure-cli (2.0.47)
acr (2.1.6)
acs (2.3.6)
advisor (0.6.0)
ams (0.2.3)
appservice (0.2.5)
backup (1.2.1)
batch (3.4.0)
batchai (0.4.3)
billing (0.2.0)
botservice (0.1.1)
cdn (0.1.1)
cloud (2.1.0)
cognitiveservices (0.2.3)
command-modules-nspkg (2.0.2)
configure (2.0.18)
consumption (0.4.0)
container (0.3.5)
core (2.0.47)
cosmosdb (0.2.1)
dla (0.2.3)
dls (0.1.3)
dms (0.1.1)
eventgrid (0.2.0)
eventhubs (0.3.0)
extension (0.2.2)
feedback (2.1.4)
find (0.2.12)
hdinsight (0.1.0)
interactive (0.3.30)
iot (0.3.3)
iotcentral (0.1.2)
keyvault (2.2.4)
lab (0.1.1)
maps (0.3.2)
monitor (0.2.4)
network (2.2.6)
nspkg (3.0.3)
policyinsights (0.1.0)
profile (2.1.1)
rdbms (0.3.2)
redis (0.3.2)
relay (0.1.2)
reservations (0.4.0)
resource (2.1.4)
role (2.1.7)
search (0.1.1)
servicebus (0.3.0)
servicefabric (0.1.4)
signalr (1.0.0)
sql (2.1.4)
storage (2.2.2)
telemetry (1.0.0)
vm (2.2.4)
Python location '/opt/az/bin/python3'
Extensions directory '/home/charles/.azure/cliextensions'
Python (Linux) 3.6.5 (default, Oct 4 2018, 05:49:33)
[GCC 7.3.0]
Legal docs and information: aka.ms/AzureCliLegal
playbook steps with version 2.0.47
mkdir foo && cd foo
az vm show -g charlesakalugwu-dev -n vm > az.vm.show.json
az storage account keys list -n openshiftimages -g images > az.storage.account.json
az disk grant-access --ids $(cat az.vm.show.json | jq .storageProfile.osDisk.managedDisk.id --raw-output) --duration-in-seconds 60 > az.disk.grant.json
cat az.disk.grant.json
cd .. && rm -rf foo
response to `az disk grant-access` for 2.0.47:
{
  "accessSas": "xxxxxx"
}
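The two response shapes differ only in where the SAS lives: 2.0.46 returns a null top-level `accessSas` with the value under `properties.output.accessSAS`, while 2.0.47 returns it directly in `accessSas`. A minimal sketch of a lookup that tolerates both, assuming the output is the JSON shown above (the function name is hypothetical and this is not what the playbook currently does):

```python
import json
from typing import Optional


def extract_access_sas(grant_access_output: str) -> Optional[str]:
    """Return the SAS from `az disk grant-access` output, or None if absent."""
    response = json.loads(grant_access_output)
    # Shape seen with 2.0.47: the SAS is a top-level field.
    sas = response.get("accessSas")
    if sas:
        return sas
    # Shape seen with 2.0.46: top-level field is null, SAS is nested.
    nested = (response.get("properties") or {}).get("output") or {}
    return nested.get("accessSAS")


old_shape = '{"accessSas": null, "properties": {"output": {"accessSAS": "xxxxxx"}}}'
new_shape = '{"accessSas": "xxxxxx"}'
print(extract_access_sas(old_shape), extract_access_sas(new_shape))
```

Checking the top-level field first and falling back to the nested one means the same step would work regardless of which of the two cli versions ends up in the image.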
The build node vm ci-operator jobs are failing with the following errors
rhel7 example
artifacts: https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/azure-build-node-image-rhel-310/61/
centos7 example
artifacts: http://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/azure-build-node-image-centos-310/1/
The failure arises in the `create_blob_from_vm.yml` file: https://github.com/openshift/openshift-ansible/blob/32bd8205276fb3824b3f366f726c88ede5adc259/playbooks/azure/openshift-cluster/tasks/create_blob_from_vm.yml#L26
It looks like the output of the previous `az disk grant-access` command in the playbook doesn't contain the expected `sas` json (it should contain a `properties.output.accessSAS` key).
It might be worth updating the version of the azure cli used in our ci-operator job step images. The latest version of the cli (`2.0.47`) was released two days ago (9th October) and it has an interesting entry in the changelog at https://docs.microsoft.com/en-us/cli/azure/release-notes-azure-cli?view=azure-cli-latest#vm
It is interesting that the version of the cli installed in our `buildimage` container is `2.0.46`. Upgrading to the latest az cli version could fix this issue.
If this doesn't work, then the only other way the output of the `az disk grant-access` command would be empty is if it is called on a disk that is still attached to a running VM. That means we would need to check the playbooks to make sure that the VM is switched off before performing `az disk grant-access`.
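The "VM is switched off" check could be sketched roughly as follows. This assumes the VM JSON comes from something like `az vm get-instance-view` and carries an `instanceView.statuses` list whose entries have codes such as `PowerState/running` or `PowerState/deallocated`; the sample document, the helper names, and the choice of which states count as "off" are all assumptions for illustration, not taken from the playbook.

```python
import json
from typing import Optional


def power_state(vm: dict) -> Optional[str]:
    """Return the part after "PowerState/" from the VM's instance view, if any."""
    statuses = (vm.get("instanceView") or {}).get("statuses") or []
    for status in statuses:
        code = status.get("code", "")
        if code.startswith("PowerState/"):
            return code.split("/", 1)[1]
    return None


def safe_to_grant_access(vm: dict) -> bool:
    """Assumption: only a stopped or deallocated VM is treated as switched off."""
    return power_state(vm) in {"stopped", "deallocated"}


# Hypothetical instance-view document, trimmed to the fields used here.
running_vm = json.loads(
    '{"instanceView": {"statuses": ['
    '{"code": "ProvisioningState/succeeded"},'
    '{"code": "PowerState/running"}]}}'
)
print(power_state(running_vm), safe_to_grant_access(running_vm))
```

A playbook step built on this idea would refuse to call `az disk grant-access` until the check passes, which would rule out the attached-to-a-running-VM explanation.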