Closed markus-hentsch closed 5 months ago
Backup sources:
Glance images can be downloaded via openstack image save
or the corresponding API action and then be stored outside of the infrastructure by the user.
OpenStack does not include any dedicated backup mechanisms for Glance images aside from that.
Source: Glance storage backend
Target: Download location
Conversely, Glance itself acts as a backup target for Ephemeral Storage disks of VMs in Nova and for volumes in Cinder, see section II.
When openstack server image create
is used on a VM that uses an Ephemeral Storage disk, a full image of the disk is created and stored in Glance. This acts as a full backup of the original disk data.
Note: If the VM also has Cinder volumes attached to it, they will not be included in the Glance image. See the
server image create
details in section III about volumes.
Source: Nova Ephemeral Storage backend
Target: Glance image storage backend
For the openstack server shelve
action, a full disk image is created if there is Ephemeral Storage involved. Any attached Cinder volumes are simply detached while the VM is in the SHELVED_OFFLOADED
state. No volume snapshots are created. If the VM has no Ephemeral Storage, no image is created in Glance.
Source: Nova Ephemeral Storage backend
Target: Glance image storage backend
When openstack volume backup create
is used, an (optionally incremental) backup of the volume data is stored in the backup backend.
Backup backends include Swift, NFS and GlusterFS, among others. The backup backend must be configured in Cinder.
Note: Encrypted volumes share the same Barbican key and LUKS encryption with their backups.
Source: Cinder storage backend (e.g. Ceph RBD, LVM, etc.)
Target: Cinder backup storage backend (e.g. NFS, Swift, etc.)
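As a minimal sketch, the backup call described above could be assembled in Python as shown below. The helper name `volume_backup_command` and the example volume and backup names are hypothetical; only the `openstack volume backup create` command with its `--name` and `--incremental` options is taken from the CLI itself.

```python
from typing import List


def volume_backup_command(volume: str, name: str, incremental: bool = False) -> List[str]:
    """Build the argument list for `openstack volume backup create`.

    This only builds the command; it does not run it. Hypothetical helper.
    """
    command = ["openstack", "volume", "backup", "create", "--name", name]
    if incremental:
        # An incremental backup only stores changes since the previous backup.
        command.append("--incremental")
    command.append(volume)
    return command


# Example with made-up names; pass the result to subprocess.run() to execute it.
print(volume_backup_command("my-volume", "my-volume-backup", incremental=True))
```

The list can be handed to `subprocess.run(command, check=True)` in an environment where the OpenStack client is installed and credentials are loaded.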
When openstack server image create
is used on a VM that has one or more volumes attached to it, the following happens:
This means that a VM with only volumes attached will result in an image that consists only of metadata and references to Cinder snapshots, with no actual binary image data in Glance itself!
Note: For volumes this action only creates snapshots which are not considered backups because they reside in the same storage backend as the volumes themselves and aren't full copies.
Source: Cinder storage backend
Target: Cinder storage backend (!)
When openstack image create --volume
is used on a volume, a full image of the binary data of the volume will be created and uploaded to Glance.
Note that this does not work on volumes currently attached to VMs. To avoid having to detach them, a detour using a volume snapshot can be taken as shown below.
1. openstack volume snapshot create --volume
2. openstack volume create --snapshot
3. openstack image create --volume
4. openstack volume delete

Warning: Creating a Glance image from an encrypted Cinder volume will store the LUKS-encrypted data blocks in the image. This image is useless without the corresponding encryption key stored in a Barbican secret!
Source: Cinder storage backend
Target: Glance storage backend
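The snapshot detour described above can be sketched as an ordered list of CLI calls. This is only an illustration: the function name and all resource names are hypothetical, and the positional arguments are my assumption of how the truncated commands in the list are completed.

```python
import shlex
from typing import List


def image_from_attached_volume_commands(
    volume: str, snapshot: str, temp_volume: str, image: str
) -> List[List[str]]:
    """Return the four CLI calls of the snapshot detour, in order.

    Nothing is executed here; the caller decides how to run the commands.
    """
    return [
        # 1. Snapshot the (attached) source volume.
        ["openstack", "volume", "snapshot", "create", "--volume", volume, snapshot],
        # 2. Create a temporary, unattached volume from that snapshot.
        ["openstack", "volume", "create", "--snapshot", snapshot, temp_volume],
        # 3. Upload the temporary volume to Glance as a full image.
        ["openstack", "image", "create", "--volume", temp_volume, image],
        # 4. Remove the temporary volume again.
        ["openstack", "volume", "delete", temp_volume],
    ]


# Print the commands for made-up resource names.
for command in image_from_attached_volume_commands(
    "my-volume", "my-volume-snap", "my-volume-copy", "my-volume-image"
):
    print(shlex.join(command))
```

Each step has to finish (the resource must leave its transitional state) before the next one is started, so a real script would need to wait between the calls.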
Barbican secrets can originate from one of two actions:
* In the case of encrypted volumes, the key stored in Barbican is crucial for the volume data to be useful. This extends to volume backups created via the Cinder Backup API or the Cinder-to-Glance image API action, since volume data is backed up in encrypted form!
Barbican does not offer any backup mechanisms for secrets.
A secret can be retrieved in plaintext using openstack secret get -p
or the corresponding API endpoint. It is then the user's responsibility to appropriately store and protect it as a backup.
Source: Barbican database
Target: Download location
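Since protecting the downloaded secret is left to the user, one small precaution is to write the key file with owner-only permissions. The following is a sketch under that assumption, not a complete key management solution; `write_secret_file` is a hypothetical helper and the payload is made up.

```python
import os
import tempfile


def write_secret_file(path: str, payload: bytes) -> None:
    """Write a secret payload to a new file readable by the owner only.

    O_EXCL makes the call fail instead of overwriting an existing file.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(payload)


# Usage example in a throwaway directory with a made-up payload.
with tempfile.TemporaryDirectory() as directory:
    key_path = os.path.join(directory, "volume.key")
    write_secret_file(key_path, b"\x01\x02\x03\x04")
    print(oct(os.stat(key_path).st_mode & 0o777))
```

File permissions only guard against other local users; the backup copy of the key still needs to be stored separately from the data it protects.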
Backups of encrypted volumes become useless if the encryption key is not backed up as well. When handling encrypted volumes, there are two things to consider:
Before Wallaby, point 1 was problematic for users since the secret reference was not visible to them via the API, so the key could not be identified in Barbican easily.
However, Cinder added the visibility of encryption_key_id
to the volume API in microversion 3.64^1 which is available since Wallaby.
Open question: do Glance images created from encrypted volumes have the key ID reference added to their metadata? I believe this should be the case otherwise they couldn't be restored properly?
Glance images created from encrypted Cinder volumes using openstack image create --volume
will carry raw LUKS-encrypted data blocks in them, meaning the image is effectively encrypted using the original volume's encryption.
This also means that the encryption is still bound to the same encryption key (LUKS passphrase) that is stored in Barbican and referenced by the encryption_key_id
attribute of the source volume.
OpenStack uses a character transformation to convert potentially binary encryption keys to valid ASCII using binascii.hexlify()
^3 before passing them to cryptsetup
as passphrases for LUKS disk encryption.
This means that secrets downloaded from Barbican must pass the same conversion in case the image data (LUKS encrypted) is to be decrypted outside of OpenStack for backup restoration purposes.
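A minimal sketch of that conversion, assuming the secret payload has already been downloaded from Barbican as raw bytes. The function name `luks_passphrase_from_key` is hypothetical and not part of OpenStack; only the use of `binascii.hexlify()` is taken from the description above.

```python
import binascii


def luks_passphrase_from_key(key_bytes: bytes) -> str:
    """Hex-encode a raw (possibly binary) key into an ASCII passphrase."""
    return binascii.hexlify(key_bytes).decode("utf-8")


# Example with a made-up 4-byte key (real keys are longer).
print(luks_passphrase_from_key(b"\x00\x1f\xab\xff"))  # prints 001fabff
```

The resulting string, not the raw key bytes, is what has to be supplied to `cryptsetup` as the passphrase.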
I used an extended DevStack environment^1 to answer the open questions above:
Open question: do Glance images created from encrypted volumes have the key ID reference added to their metadata? I believe this should be the case otherwise they couldn't be restored properly?
The secret of the volume is cloned and the clone's ID is then bound to the image and referenced as properties.cinder_encryption_key_id
in the image's metadata:
openstack image show ...
+------------------+-----------------------------------------------------------------------------------------+
| Field | Value |
+------------------+-----------------------------------------------------------------------------------------+
| ... | ... |
| owner | d3c3a86fda9c4190960cbd1e9496ab82 |
| properties | cinder_encryption_key_deletion_policy='on_image_deletion', |
| | cinder_encryption_key_id='55f142f0-4915-47fe-80ed-69b34aa77e7f', hw_rng_model='virtio', |
| | ... |
| ... | ... |
+------------------+-----------------------------------------------------------------------------------------+
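For scripting, the key ID can be pulled out of the image metadata programmatically. The sketch below assumes the JSON output of `openstack image show -f json -c properties`, which I expect to be an object with a `properties` mapping; the function name is hypothetical and the sample data mirrors the table above.

```python
import json
from typing import Optional


def encryption_key_id_from_image(image_json: str) -> Optional[str]:
    """Return the cinder_encryption_key_id image property, if present."""
    image = json.loads(image_json)
    return image.get("properties", {}).get("cinder_encryption_key_id")


# Sample input modelled on the table above (assumed JSON layout).
sample = json.dumps({
    "properties": {
        "cinder_encryption_key_deletion_policy": "on_image_deletion",
        "cinder_encryption_key_id": "55f142f0-4915-47fe-80ed-69b34aa77e7f",
        "hw_rng_model": "virtio",
    }
})
print(encryption_key_id_from_image(sample))
```

An image that was not created from an encrypted volume has no such property, in which case the function returns `None`.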
This means that secrets downloaded from Barbican must pass the same conversion in case the image data (LUKS encrypted) is to be decrypted outside of OpenStack for backup restoration purposes.
I was able to successfully mimic what OpenStack does with LUKS and the Barbican secret outside of Glance/Cinder using the following procedure:
openstack image save --file image.raw $IMAGE_NAME_OR_ID
openstack image show -f value -c properties $IMAGE_NAME_OR_ID
# (use the value of `cinder_encryption_key_id` as `$SECRET_ID` below)
openstack secret get --file image.key --payload_content_type "application/octet-stream" $SECRET_ID
python3 -c "import binascii; \
f = open('image.key', 'rb'); \
print(binascii.hexlify(f.read()).decode('utf-8'))" \
| sudo cryptsetup luksOpen ./image.raw decrypted_image
The image's contents are now loaded as /dev/mapper/decrypted_image
and can be mounted or snapshotted.
I've integrated the findings of the previous comment about the secret handling into the docs PR.
user data as in data uploaded by the user to the cloud
Should we also add the user data provided via the meta data service to a running instance?
Are you referring to the script/configuration data that can be passed as user_data
in Nova's server POST /servers
API request^1?
Your statement makes it sound like something that can be added at runtime but I'm not aware of anything after the creation of the server. It doesn't seem like PUT /servers
allows user_data
to be modified judging from the API docs.
In any case good point, I'll have a look.
FTR, I also inspected the behavior of Cinder Backup and encrypted volumes after adding Swift and Cinder Backup to my DevStack.
When using openstack volume backup create
on a volume that is using a volume type with the LUKS encryption:
encryption_key_id
in the backup's metadata
This is mostly identical to the behavior of openstack image create --volume
in regards to the handling of key and encryption.
I was thinking more about what happens if you use the backup for a migration to another cloud or have a DR case where you have to restore everything. In that case it makes sense IMO to backup the user data as well, even if it can't be modified. The user data itself is created by some external tool (e.g. Terraform).
I had a look and this is stored in the OS-EXT-SRV-ATTR:user_data
field of openstack server show.
However, it is only visible to admins, as the API documentation^1 states:
The user_data the instance was created with. By default, it appears in the response for administrative users only.
I have verified this and indeed the field is shown as empty if the API call is made as a normal user (even if it was the creator of the server) and only shows up when authenticated as admin.
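To illustrate what a client sees, here is a small sketch that reads the field from a server record such as the JSON output of `openstack server show -f json`. It assumes the value is returned Base64-encoded, the same way it is submitted at server creation, and that the key is named as above; the function name is hypothetical.

```python
import base64
from typing import Optional


def decode_user_data(server: dict) -> Optional[str]:
    """Return the decoded user_data, or None if the field is hidden or empty."""
    encoded = server.get("OS-EXT-SRV-ATTR:user_data")
    if not encoded:
        # Non-admin users get an empty field, as described above.
        return None
    return base64.b64decode(encoded).decode("utf-8")


# Made-up records: one as an admin would see it, one as a normal user.
admin_view = {"OS-EXT-SRV-ATTR:user_data": base64.b64encode(b"#cloud-config\n").decode("ascii")}
user_view = {"OS-EXT-SRV-ATTR:user_data": None}
print(repr(decode_user_data(admin_view)))
print(repr(decode_user_data(user_view)))
```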
This is one thing we could change at the CSP side of things by creating a standard/decision that CSPs have to adjust their Nova API policy to make this field visible to users in the project.
It doesn't seem to be that easy because the policy file is not fine-grained enough: the visibility of all OS-EXT-SRV-ATTR:*
metadata attributes is controlled by a single policy rule^1.
This means exposing many more attributes than just the user_data
(including the compute host identity) to the user, which is most likely not desired by the CSP.
I don't think we can offer any means of retrieving the originally supplied user_data
to customers with the current Compute API.
While researching and testing proper instructions for handling Cinder volume backups, I stumbled upon some issues related to encrypted volumes.
When the type of the volume, from which a backup is created, is a non-default encrypted type, things get messy. Consider the following scenario:
# create an encrypted (non-default) volume type
openstack volume type create \
--property volume_backend_name='lvmdriver-1' \
--encryption-provider luks \
--encryption-cipher aes-xts-plain64 \
--encryption-key-size 256 \
--encryption-control-location front-end \
lvmdriver-1-LUKS
# create encrypted volume
openstack volume create --size 2 --image "cirros-0.6.2-x86_64-disk" --type lvmdriver-1-LUKS encrypted-volume
# create backup from encrypted volume
openstack volume backup create --name volume-backup encrypted-volume
# restore backup into new volume
openstack volume backup restore volume-backup new-volume-restored-from-backup
# check the status of the volume
openstack volume show new-volume-restored-from-backup -f value -c status
error_restoring
When inspecting the log output of the Cinder Backup service, the following log message can be seen:
cinder.exception.EncryptedBackupOperationFailed: The source volume
type 'a935d148-0d0f-4c25-8459-669f77871c92' is different than the
destination volume type '165c9c5e-477a-4a35-965b-58e504ff4ae3'.
The openstack volume backup restore
command seems to lack a parameter for specifying a volume type. The same goes for the /v3/{project_id}/backups/{backup_id}/restore
API. Thus, it is only possible to restore such a backup by creating an empty and sufficiently sized volume beforehand and then force-restoring onto it:
openstack volume create --size 2 --type lvmdriver-1-LUKS empty-volume
openstack volume backup restore --force volume-backup empty-volume
openstack volume delete empty-volume encrypted-volume
# switch to admin and delete the volume type (DevStack example)
source openrc admin admin
openstack volume type delete lvmdriver-1-LUKS
Now, the volume type that the backup was originally based on has been deleted, rendering the backup unusable because no new volume can have a matching volume type ID.
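As long as the source volume still exists, a user can at least compare volume types before attempting a restore. The sketch below is a user-side check and not Cinder's own logic: it assumes the JSON output of `openstack volume show -f json` contains a `type` field holding the volume type name, and the function name is hypothetical.

```python
def restore_types_match(source_volume: dict, target_volume: dict) -> bool:
    """Compare the volume type names of a backup's source and a restore target.

    Matching names are necessary but not sufficient: Cinder compares type IDs,
    so a type that was deleted and recreated under the same name still fails.
    """
    source_type = source_volume.get("type")
    return source_type is not None and source_type == target_volume.get("type")


# Made-up records using the type name from the scenario above.
source = {"name": "encrypted-volume", "type": "lvmdriver-1-LUKS"}
good_target = {"name": "empty-volume", "type": "lvmdriver-1-LUKS"}
bad_target = {"name": "other-volume", "type": "lvmdriver-1"}
print(restore_types_match(source, good_target))  # prints True
print(restore_types_match(source, bad_target))   # prints False
```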
In summary, this poses the following problems:
openstack volume backup restore
(and the corresponding API) and cannot restore the backup on a new (yet-to-be-created) volume.
openstack volume backup show
or the corresponding API. A user cannot determine the correct volume type based on the backup resource alone. Only the Cinder Backup service log file contains the mismatching type IDs as shown in the quoted error message above.
Here is the code part in Cinder Backup that compares the volume type IDs:
I reported the issues identified in https://github.com/SovereignCloudStack/standards/issues/541#issuecomment-2056574330 as https://bugs.launchpad.net/cinder/+bug/2061458 upstream.
For the CSP side of things, I drafted a standard at #567 to make Cinder volume backup mandatory. This ties in with what I wrote down in the user guide docs PR in regards to how to use the functionality.
Beyond that, I don't feel confident creating any other CSP-side standards on the IaaS user data backup topic, and I think #527 is a better place for a holistic approach to CSP-side backups in general.
Combining the availability of the volume backup functionality as per #567 and the user guide of SovereignCloudStack/docs#176 should give users the basic tools needed to create backups of their IaaS resources' data if necessary.
As a CSP I want to know where user data[^1] is aggregated, how it can be backed up and which standards SCS establishes in regards to those backups.
As a customer I want clear documentation and guidelines on how to back up my user data[^1] using native OpenStack mechanisms or alternatives compatible with SCS clouds.
[^1]: user data as in data uploaded by the user to the cloud and data generated by the user in the cloud at runtime (e.g. VM disk filesystems). This excludes network traffic, RAM contents and IDM data in Keystone as well as cloud resource configuration data (VM, volume, network metadata etc.).
Definition of Done: