OpenNebula / one

The open source Cloud & Edge Computing Platform bringing real freedom to your Enterprise Cloud 🚀
http://opennebula.io
Apache License 2.0

Add REMOTE_HOST_PATH attribute for rsync backups datastore #6203

Open Franco-Sparrow opened 1 year ago

Franco-Sparrow commented 1 year ago

Hello Team

Description

In ON CE/EE 6.6.0 and ON EE 6.6.1, OpenNebula always saves to the /var/lib/one/datastores/<DS_ID> directory on the remote host (backup server), and the code assumes that this directory is accessible to the orchestrator (frontend) as part of its own filesystem. When the remote host acts as a backup server for multiple OpenNebula clusters, each cluster must back up to its own directory to prevent the clusters' backups from colliding. We therefore decided to share, via NFS, the directory that the remote host keeps for each cluster (in this case ONC1). A symbolic link from the backup datastore directory to the NFS mount point solves the problem: OpenNebula accesses the directory to perform backup and restore operations as if it were on its own filesystem. For this solution to work, RSYNC_HOST = <orchestrator_floating_ip> is required for a multi-node cluster with multiple orchestrators, while RSYNC_HOST = <orchestrator_ip> is used for a single-node cluster. The orchestrator accesses the backup datastore directory /var/lib/one/datastores/<DS_ID>; the symbolic link redirects it to the NFS mount point /var/lib/datastores/nfs_mount, which in turn points at the remote backup server, where cluster ONC1 has its own backups directory /var/lib/vault/onc1/<DS_ID>.
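For reference, the workaround above can be sketched as follows. The paths are the example values from this thread (cluster ONC1, datastore 102); the export, mount, and ln commands are shown as comments because they need root on the respective hosts:

```shell
# Sketch of the NFS + symlink workaround, using the example values above.
CLUSTER=onc1
DS_ID=102
VAULT_DIR="/var/lib/vault/${CLUSTER}/${DS_ID}"   # real per-cluster dir on the backup server
MOUNT_POINT="/var/lib/datastores/nfs_mount"      # NFS mount point on the frontend
DS_DIR="/var/lib/one/datastores/${DS_ID}"        # path OpenNebula expects to use

# On the backup server, export the per-cluster directory (line for /etc/exports):
#   /var/lib/vault/onc1  <frontend_ip>(rw,sync,no_subtree_check)
# On the frontend, mount the export and link the datastore path onto it:
#   mount -t nfs <backup_server_ip>:"$VAULT_DIR" "$MOUNT_POINT"
#   ln -s "$MOUNT_POINT" "$DS_DIR"
echo "${DS_DIR} -> ${MOUNT_POINT} -> ${VAULT_DIR}"
```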

In tests of this proposed solution (sharing the cluster's backup directory on the remote server via NFS), we observed that the rsync copies were somewhat slow and intermittent; this limitation was imposed by the transfers over NFS. Since the remote directory for rsync backups cannot be specified, we modified the driver so that the transfer bypasses NFS and accesses the cluster's directory on the remote server directly, which is not the one defined automatically by OpenNebula (/var/lib/one/datastores/<DS_ID>). With a direct rsync to the real directory for the cluster ONC1 backups on the remote host, we achieved higher and more constant speeds than those obtained over NFS.

On Remote Host (Backup Server)

Create directory for the backups of the cluster ONC1:

CLIENT=onc1
DS_ID=102
mkdir -p /var/lib/vault/$CLIENT/${DS_ID} && \
chown -R oneadmin:oneadmin /var/lib/vault/$CLIENT

On orchestrator leader

Edit the file /var/lib/one/remotes/datastore/rsync/backup:

VER=6.6.1
sudo -u oneadmin cp /var/lib/one/remotes/datastore/rsync/backup /var/lib/one/remotes/datastore/rsync/backup.orig-$VER && \
sudo -u oneadmin nano /var/lib/one/remotes/datastore/rsync/backup

Press Alt + Shift + 3 to show the line numbers.

Edit the following lines as shown (adapt to your cluster):

# [...]
 96 cluster_name = "onc1"
# [...]
 98 ds_id = "102"
# [...]
101 remote_host_path = "/var/lib/vault/#{cluster_name}/#{ds_id}/#{vmid}/#{backup_id}/"
# [...]
108                                  :host => "#{rsync_user}@192.168.1.254",
109                                  :cmds => "mkdir -p #{remote_host_path}",
# [...]
118 cmd = "rsync #{args} #{vm_dir}/ #{rsync_user}@192.168.1.254:#{remote_host_path}/"

NOTE
The RSYNC_HOST in this case will not be the one defined in the backup datastore attributes. It will be the IP of the backup server, not the IP of the orchestrator that can access the default base_path defined by OpenNebula for every rsync backup datastore.

Copy the modified file to the rest of the orchestrators.
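A minimal way to propagate the patched driver, assuming hypothetical standby orchestrator hostnames orch2 and orch3 (the scp commands are echoed for illustration; remove the echo to actually copy the file):

```shell
# Hypothetical standby orchestrator hostnames; adjust to your HA setup
ORCHESTRATORS="orch2 orch3"
DRIVER=/var/lib/one/remotes/datastore/rsync/backup

for h in $ORCHESTRATORS; do
    # echoed for illustration; drop "echo" to run the copy for real
    echo sudo -u oneadmin scp "$DRIVER" "oneadmin@${h}:${DRIVER}"
done
```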

Use case

The rsync backups in OpenNebula should be prepared for a remote host holding backups of multiple OpenNebula clusters, not just one, as the code currently imposes by defining the same base_path for backups (the path to the rsync backup datastores).
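As a sketch of what the requested interface could look like, the new attribute might be set per datastore. REMOTE_HOST_PATH does not exist in 6.6.x; the attribute name and values here are hypothetical, taken from the ONC1 example above:

```shell
# Hypothetical datastore update template for the proposed attribute.
# REMOTE_HOST_PATH is the feature being requested, not an existing attribute.
cat > ds_update.txt <<'EOF'
RSYNC_HOST = "192.168.1.254"
RSYNC_USER = "oneadmin"
REMOTE_HOST_PATH = "/var/lib/vault/onc1/102"
EOF
# Apply it with: onedatastore update 102 ds_update.txt
```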

Interface Changes

Changes should be made in the rsync backup driver.



nachowork90 commented 1 year ago

As a workaround we can create a different ID for each cluster, but choosing a destination PATH is a more portable solution.

This way we can keep the dev cluster and the prod one reusing the same backup host as the backup repository!
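The distinct-ID workaround amounts to registering the backup datastore separately on each cluster, so the auto-assigned IDs (and hence the default /var/lib/one/datastores/<DS_ID> paths on the shared backup host) differ. A sketch, where the datastore name and host IP are assumptions:

```shell
# Hypothetical template for a per-cluster rsync backup datastore.
# Registering it on each cluster yields a different auto-assigned DS ID,
# so the default paths no longer collide on the shared backup host.
cat > backup_ds.txt <<'EOF'
NAME       = "rsync-backups-dev"
DS_MAD     = "rsync"
TM_MAD     = "-"
TYPE       = "BACKUP_DS"
RSYNC_HOST = "192.168.1.254"
RSYNC_USER = "oneadmin"
EOF
# Register with: onedatastore create backup_ds.txt
```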

sk4zuzu commented 1 year ago

Hi @Franco-Sparrow :)

I think the second part of this statement is not accurate (please correct me if I misunderstood you):

OpenNebula will always try to save to the remote host (backup server) in the /var/lib/one/datastores/ directory and the code will consider that said directory is accessible by the orchestrator (frontend), as part of its own filesystem.

Backup servers do not need this directory to be mounted via shared storage (the only things that are required to be the same are paths). There are 2 cases handled:

  1. Backup server is external (the intended one).
  2. Backup server is the same server as frontend server, in such case the directory is "shared".

OK, back to your feature request... :)

So the real reason the directory /var/lib/one/datastores/ is named that way is consistency. We offer the "restic" datastore in the Enterprise Edition of OpenNebula, and there is no good way in restic to change those paths (qcow2 files are taken from the fs and those paths are preserved exactly as keys in restic's database).

So for the rsync datastore it is an option to make those paths configurable; however, we certainly won't provide any guarantee of consistency when users decide to change that path (if there are backups stored already) :point_up: :relieved: . I think we can do this with that caveat in mind, WDYT @rsmontero ?

As for workarounds, another one would be, for example, preparing nspawn (or LXC, whatever) containers for each ON cluster, which is relatively easy + should be much safer :thinking:.
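That container idea could be sketched as follows, assuming one systemd-nspawn machine per cluster on the backup server, each bind-mounting its own vault directory onto the default datastore path. The nspawn invocation is shown as a comment since it needs a prepared rootfs and root privileges:

```shell
# Assumed layout: one container per ON cluster, named after the cluster.
CLUSTER=onc1
VAULT="/var/lib/vault/${CLUSTER}"

# Inside its container, each cluster would see its own vault directory at
# the default path, so no driver changes would be needed:
#   systemd-nspawn -D "/var/lib/machines/${CLUSTER}" \
#       --bind="${VAULT}:/var/lib/one/datastores" \
#       --network-veth -b
echo "${CLUSTER}: ${VAULT} -> /var/lib/one/datastores"
```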

Franco-Sparrow commented 1 year ago

Hi @sk4zuzu

The problem is that, if you have a backup server for multiple OpenNebula clusters, you can't use this path on the remote host for each cluster, because the VM backups will collide with each other. @nachowork90, @kCyborg and I made a workaround that allows us to define a specific remote path for each OpenNebula cluster on the backup server. Each directory is shared through NFS as if it were /var/lib/one/datastores/<DS_ID>/; DS_ID could be 100 for every cluster and the backups still would not collide, because on the backup server that directory is not the same as the other clusters'. Take a look at the workaround here (my last comment).

We transformed the rsync driver into an nfs driver for backups :)

Regards, and hope 6.6.2 is ready soon!!!

Franco-Sparrow commented 1 year ago

@sk4zuzu your idea is not bad... you could create an LXC container inside your remote backup server, with these LXC containers acting as the remote host for each ON cluster. This consumes one IP from your network for each cluster (a minor problem), and you could share the backup server's local storage with those LXC containers (one specific folder for each cluster). This way you don't need to make any modifications to the rsync driver, but there is still the problem that ON will always back up first to local storage and then rsync to the remote host.

My teammates and I studied this, and after the modifications we made here, you will be able to back up directly to the remote host. This is good, as backups will not consume the production storage of the ON cluster, because they are made directly on the remote host. A requirement for this is that the remote host's folder for each ON cluster needs to be shared with the ON node and the orchestrator, as they will act on these folders as if they were part of their own filesystem. Instead of an rsync action, it will be a cp action.