anarcat opened this issue 1 year ago
oh, and for what it's worth, I have a manual procedure for moving VMs around with export/import documented here:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/ganeti/#migrating-a-vm-between-clusters
it's basically, on the source node:

```
gnt-backup export -n chi-node-01.torproject.org test-01.torproject.org
```

and on the target node:

```
rsync -ASHaxX --info=progress2 root@chi-node-01.torproject.org:/var/lib/ganeti/export/test-01.torproject.org/ /var/lib/ganeti/export/test-01.torproject.org/
gnt-backup import -n dal-node-01:dal-node-02 --src-node=dal-node-01 --src-dir=/var/lib/ganeti/export/test-01.torproject.org --no-ip-check --no-name-check --net 0:ip=pool,network=gnt-dal-01 -t drbd --no-wait-for-sync test-01.torproject.net
```
... so that works, but it's still a manual process with multiple steps, and everything is a little error-prone: multiple long-running processes punctuated by manual copy-paste steps, which is not ideal for large clusters. (And yes, I could also write my own automation to speed this up, but then i'd be rewriting `move-instance`, wouldn't i? :)
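(for illustration, the sort of wrapper i mean: a minimal sketch with the names from above hard-coded, no real error handling, meant to run on the destination master)

```sh
#!/bin/sh
# sketch: evacuate one instance from the chi cluster to the dal cluster;
# run on the destination master (dal-node-01 here)
set -e
INSTANCE=test-01.torproject.org
SRC=chi-node-01.torproject.org

# 1. export the instance on the source cluster
ssh "root@$SRC" "gnt-backup export -n $SRC $INSTANCE"
# 2. copy the export over
rsync -ASHaxX --info=progress2 \
    "root@$SRC:/var/lib/ganeti/export/$INSTANCE/" \
    "/var/lib/ganeti/export/$INSTANCE/"
# 3. import it on the destination cluster (note the new domain name)
gnt-backup import -n dal-node-01:dal-node-02 --src-node=dal-node-01 \
    --src-dir="/var/lib/ganeti/export/$INSTANCE" \
    --no-ip-check --no-name-check \
    --net 0:ip=pool,network=gnt-dal-01 -t drbd --no-wait-for-sync \
    test-01.torproject.net
```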
Hi @anarcat,
we have used the tool a lot in all of our Ganeti 2.16 -> 3.0 upgrades (basically, we set up new small clusters with fresh hardware and Ganeti 3.0, moved some instances, re-purposed/re-installed older nodes where possible and added them to the new cluster(s), moved some more instances, etc.). We also stumbled upon some bugs and/or missing features which are fixed in master here, here and here. You can use the script standalone, without the rest of the tree, directly from master to carry out the migrations.
We ended up with the following pre-setup:

- both clusters share the same `/var/lib/ganeti/cluster-domain-secret` (copy the file over and use `gnt-cluster redist-conf` to redistribute it on both source and destination clusters)
- `socat` is installed from the base distribution, not from the backports repository (Ganeti 2.16 contains a fix for that, so we did not face that issue on newer machines)
- make sure source Ganeti nodes are able to establish TCP connections on ports > 1024 to destination Ganeti nodes (inter-cluster-migration will always use the primary network of the cluster, not a secondary/alternate network which may be configured for e.g. DRBD stuff)
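For illustration, the secret sharing boils down to something like this (hostnames are placeholders; run this on the destination master):

```sh
# make both clusters share the same cluster domain secret:
# copy it over from the source master...
scp root@src-master.example.org:/var/lib/ganeti/cluster-domain-secret \
    /var/lib/ganeti/cluster-domain-secret
# ...redistribute it to all nodes of the destination cluster...
gnt-cluster redist-conf
# ...and restart the daemons so they pick it up
systemctl restart ganeti
```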
We then used an Ansible playbook to carry out the actual instance migrations, but that happened mainly due to easier integration into other internal workflows which are not relevant here.

Our instance scenario: instances move from older clusters where each VLAN has its own bridge (e.g. `br604`) to new clusters with a single VLAN-aware bridge (`gnt-bridge`). The playbook did the following (a combined example command line is sketched after this list):

- run `move-instance` with the `--keep-instance` flag, so the instance is kept on the source cluster instead of being removed after the move
- the export/import scripts from the debootstrap OS provider are broken/not usable in our scenario ("partition style" full disk images), so we replace them on-the-fly with the ones from gnt-noop which simply use `dd` to do a bitwise transfer of all disks
- because of the change in the instance network configuration (see above) we alter the network parameters (e.g. `link=br604` turns into `link=gnt-bridge,vlan=604`) and also statically set the mac address the interface had on the source cluster (otherwise a new one will be assigned during instance creation on the destination cluster) - this can be done by adding something like `--net 0:link=gnt-bridge,vlan=604,mac=aa:bb:cc:dd:ee:ff` to the command-line of `move-instance`

I think I remember that while looking at the code of `move-instance` we found out that it actually does make some assumptions on the instance configuration which might lead to the NIC-configuration-related error you stated above.
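To make that concrete, a single move along those lines looks roughly like this (cluster names, credential files and NIC values are placeholders, not our actual setup):

```sh
./move-instance \
    --src-ca-file=rapi-ca.pem \
    --src-username=move --src-password-file=rapi-password \
    --iallocator=hail \
    --keep-instance \
    --net 0:link=gnt-bridge,vlan=604,mac=aa:bb:cc:dd:ee:ff \
    cluster-old.example.org cluster-new.example.org instance-01.example.org
```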
The above is far from perfect and still yields some issues that should actually be fixed upstream. But it worked well for us with several hundred instances so far. Hope that helps a bit :-)
We have also used the same approach to move instances between Ganeti 3.0 clusters. However, due to an issue with more recent `socat` versions, this needs a manual change of the export/import code on the source node :-(

More information can be found in this issue.
Oh, and to add something more useful to this issue: I would suggest a) extending the documentation with more guidance/example commands/pitfalls, and b) of course fixing the open/known issues, e.g. the setting of the shared secret, which clearly is a bug.
I might find some time in the next days to extend the documentation.
wow, that's all extremely useful! that `--keep-instance` flag is invaluable, I didn't even realize the `move-instance` script was trashing instances on the source cluster, ouch! I guess it makes sense because of the "move" semantics, but still, dang...
> the export/import scripts from the debootstrap OS provider are broken/not usable in our scenario ("partition style" full disk images), so we replace them on-the-fly with the ones from gnt-noop which simply use `dd` to do a bitwise transfer of all disks
amazing, I converged on the exact same thing, probably because of the exact same bug, see https://github.com/ganeti/instance-debootstrap/issues/18
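for the record, those replacement scripts are basically trivial; a sketch of the idea, using the standard OS interface environment variables:

```sh
#!/bin/sh
# export script: stream the instance's disk to stdout, bit for bit
exec dd if="$EXPORT_DEVICE" bs=1M
```

```sh
#!/bin/sh
# import script: write the incoming stream onto the freshly created disk
exec dd of="$IMPORT_DEVICE" bs=1M
```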
> because of the change in the instance network configuration (see above) we alter the network parameters (e.g. `link=br604` turns into `link=gnt-bridge,vlan=604`) and also statically set the mac address the interface had on the source cluster (otherwise a new one will be assigned during instance creation on the destination cluster) - this can be done by adding something like `--net 0:link=gnt-bridge,vlan=604,mac=aa:bb:cc:dd:ee:ff` to the command-line of `move-instance`
so basically I need to actually allocate a MAC address for each VM I move? ouch?
I was hoping i could just batch-move instances here to quickly evacuate a cluster, individually mapping MAC addresses doesn't sound like a fun time...
> I think I remember that while looking at the code of `move-instance` we found out that it actually does make some assumptions on the instance configuration which might lead to the NIC-configuration-related error you stated above.
okay, that definitely sounds familiar. what's strange with this problem is that it occurs whether I pass a `--net` argument or not. it seems like there's a builtin default somewhere that conflicts with another default... without a `--net` option, i end up with the following nic configuration in the remote create job:
```yaml
nics:
- ip: 38.229.82.23
  link: br0
  mac: 06:66:38:c4:0c:23
  mode: bridged
  network: 097c2565-dab9-4a29-9519-b987718ed812
  vlan:
```
what's interesting there is that the `ip` is actually the one from the source cluster. in a sense, it's obviously incorrect as it does, indeed, have both an IP and a `network` field, as described, but it's not supplied by the operator. i wonder wth is going on here...
@rbott
> I think I remember that while looking at the code of `move-instance` we found out that it actually does make some assumptions on the instance configuration which might lead to the NIC-configuration-related error you stated above.
i'd really love to hear where you found that code, because what I found was pretty generic, copying data around. i've made #1698 which seems to work as a stopgap measure here.
i do wonder if the right place to do this might not be somewhere in here:
i just can't figure out what to do with this stuff... it seems like it makes sense to inherit it, but we're actually creating garbage here, because this is where we create that dict which has both `network` and `mode`, for example... at least failing here would fail early and facilitate debugging? not sure what the best way forward is here either.
so i have two more PRs here, #1698 and #1697, which fix the problems i've encountered so far. i'm at this error now:
```
2023-03-13 20:56:10,146: Move1 INFO [Mon Mar 13 20:56:10 2023] - WARNING: export 'export-disk1-2023-03-13_20_55_58-_1zmyfcu' on chi-node-08.torproject.org failed: Exited with status 1
2023-03-13 20:56:10,146: Move1 INFO [Mon Mar 13 20:56:10 2023] Disk 1 failed to send data: Exited with status 1 (recent output: dd: 0 bytes copied, 0.998604 s, 0.0 kB/s\ndd: 0 bytes copied, 6.00403 s, 0.0 kB/s\nsocat: E SSL_connect(): Connection refused)
```
i think this could be related to:

> make sure source Ganeti nodes are able to establish TCP connections on ports > 1024 to destination Ganeti nodes (inter-cluster-migration will always use the primary network of the cluster, not a secondary/alternate network which may be configured for e.g. DRBD stuff)
i have punched holes in the primary nodes, but not all nodes, so this might be what's crashing this...
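for anyone else hitting this, the hole-punching amounts to something like the following on every destination node (with the source cluster's network as a placeholder):

```sh
# allow the source cluster's nodes to reach the dynamically allocated
# import ports (> 1024) on this destination node
iptables -A INPUT -p tcp -s 38.229.82.0/24 --dport 1024:65535 -j ACCEPT
```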
and then I guess i'll catch up with your #1681... how did you actually work around that one?
> > because of the change in the instance network configuration (see above) we alter the network parameters (e.g. `link=br604` turns into `link=gnt-bridge,vlan=604`) and also statically set the mac address the interface had on the source cluster (otherwise a new one will be assigned during instance creation on the destination cluster) - this can be done by adding something like `--net 0:link=gnt-bridge,vlan=604,mac=aa:bb:cc:dd:ee:ff` to the command-line of `move-instance`
>
> so basically I need to actually allocate a MAC address for each VM I move? ouch?
Well yes and no. If you provide a `--net` parameter and leave out `mac`, it will default to the value of `generate`, which will cause the destination Ganeti cluster to roll the dice and generate a new MAC address. If that does not cause any problems for you, you can completely ignore this. But if it does cause problems (DHCP reservations, older systems with autogenerated udev rules for ethX names, etc.), you might want to retain the original MAC address. In our case we simply ask RAPI on the source cluster for the current MAC address(es) of the instance and pass it to the `--net` parameter of the `move-instance` command. In case of our Ansible playbook it is a simple extra task. But YMMV, it might not even be required to retain the MAC address(es) :-)
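(not our actual playbook task, but the RAPI lookup amounts to something like this, with placeholder credentials and hostname:)

```sh
# ask the source cluster's RAPI for the instance's current MAC address(es)
curl -sk -u move:secret \
    "https://src-master.example.org:5080/2/instances/instance-01.example.org" \
    | jq -r '."nic.macs"[]'
```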
> > I think I remember that while looking at the code of `move-instance` we found out that it actually does make some assumptions on the instance configuration which might lead to the NIC-configuration-related error you stated above.
>
> i'd really love to hear where you found that code, because what I found was pretty generic, copying data around. i've made #1698 which seems to work as a stopgap measure here.
I probably should have looked at the code again before posting assumptions, sorry for that :-) But I think you have found the right spot, and #1698 (along with @apoikos' annotation/review) should do the trick and solve that issue.
> and then I guess i'll catch up with your #1681... how did you actually work around that one?
Well, we took the short (and ugly) route and "hot-patched" this file on the sending node(s):

https://github.com/ganeti/ganeti/blob/114e59fcc9d4a7c82618569f5d6b7389a0f80123/lib/impexpd/__init__.py#L91

...to state `verify=0`. As we mainly used `move-instance` to migrate from older clusters to 3.0 clusters, we rarely ran into this problem (mostly cases where we actually moved an instance to the wrong destination cluster and had to move it between two 3.0 clusters afterwards). But nevertheless it is actually broken right now for everyone using 3.0, and it needs a proper solution.
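On a Debian system, where that file ships under a versioned `/usr/share/ganeti` directory, the hot-patch amounts to something like this (the path and the exact `verify=1` string are assumptions, adjust to your installation):

```sh
# flip socat certificate verification off in impexpd on the sending node
sed -i 's/verify=1/verify=0/' /usr/share/ganeti/default/ganeti/impexpd/__init__.py
```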
On 2023-03-14 04:36:44, Rudolph Bott wrote:
> > > because of the change in the instance network configuration (see above) we alter the network parameters (e.g. `link=br604` turns into `link=gnt-bridge,vlan=604`) and also statically set the mac address the interface had on the source cluster (otherwise a new one will be assigned during instance creation on the destination cluster) - this can be done by adding something like `--net 0:link=gnt-bridge,vlan=604,mac=aa:bb:cc:dd:ee:ff` to the command-line of `move-instance`
> >
> > so basically I need to actually allocate a MAC address for each VM I move? ouch?
>
> Well yes and no. If you provide a `--net` parameter and leave out `mac`, it will default to the value of `generate`, which will cause the destination Ganeti cluster to roll the dice and generate a new MAC address. If that does not cause any problems for you, you can completely ignore this. But if it does cause problems (DHCP reservations, older systems with autogenerated udev rules for ethX names, etc.), you might want to retain the original MAC address. In our case we simply ask RAPI on the source cluster for the current MAC address(es) of the instance and pass it to the `--net` parameter of the `move-instance` command. In case of our Ansible playbook it is a simple extra task. But YMMV, it might not even be required to retain the MAC address(es) :-)
What I meant is that to override the error, I need to pass a `mac=` setting somehow. But yeah, I think it's okay if our MACs get renumbered. We do have a per-cluster MAC prefix anyway, so it would be odd for those VMs to be different.
[...]
> > and then I guess i'll catch up with your #1681... how did you actually work around that one?
>
> Well, we took the short (and ugly) route and "hot-patched" this file on the sending node(s): https://github.com/ganeti/ganeti/blob/114e59fcc9d4a7c82618569f5d6b7389a0f80123/lib/impexpd/__init__.py#L91 ...to state `verify=0`. As we mainly used `move-instance` to migrate from older clusters to 3.0 clusters, we rarely ran into this problem (mostly cases where we actually moved an instance to the wrong destination cluster and had to move it between two 3.0 clusters afterwards). But nevertheless it is actually broken right now for everyone using 3.0, and it needs a proper solution.
While we're talking about monkeypatching stuff here, I wonder if there's a cleaner way to bypass this than just disabling verification. In our case, this is flying over an untrusted network, so I actually really don't want to disable verification. I think. Maybe there's a way to hardcode the CA or something?
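socat itself can pin a trust anchor via its `cafile` option, so maybe verification could stay enabled while pointing it at a known CA; something like this (hypothetical host, port and CA path) would at least show whether the handshake can succeed against a pinned certificate:

```sh
# probe a running import listener on the destination node, verifying its
# certificate against a pinned CA instead of disabling verification outright
socat -u /dev/null \
    OPENSSL:dal-node-01.torproject.org:41000,cafile=/tmp/dest-ca.pem,verify=1
```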
okay, so i think this ticket can remain for documentation: i filed #1697 for the python 3 stuff, #1698 for the NIC stuff (which could also be improved), and #1699 for the commonname stuff.
what would remain here is documenting the heck out of all this.
I'm trying to migrate between two Ganeti clusters. I have found with great anticipation the move-instance command, but I'm having a hard time making it work.
At first, it would just crash with a backtrace in Debian bullseye:
That's due to this code:
https://github.com/ganeti/ganeti/blob/114e59fcc9d4a7c82618569f5d6b7389a0f80123/tools/move-instance#L941
If I pass `--opportunistic-tries=1`, it tells me:

So, basically, right now, you must use:
According to @apoikos (on IRC), the `TypeError` is a python2-to-3 leftover.

The next problem I had with `move-instance` was a `ganeti.rapi.client.Error: Password not specified`, but that was me failing at setting up the RAPI users. I also got `ganeti.rapi.client.GanetiApiError: 401 Unauthorized: No permission -- see authorization schemes` on the destination cluster. Maybe the documentation could be improved to lead the operator the right way here ("check your RAPI users again"). Having a way to test the users out of band (say with `curl`) would also be useful.
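Something as simple as this (placeholder credentials and hostname) would already tell the operator whether a RAPI user works at all:

```sh
# out-of-band check of a RAPI user: should return cluster information as JSON
curl -k -u move:secret https://dest-master.example.org:5080/2/info
```

Then I had another error which was pretty opaque: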
So that might seem obvious but I did copy the secret over and ran:
So it seems the bug there is that the `--cluster-domain-secret=` argument actually fails to replace the secret on the cluster. I had to manually copy the `cluster-domain-secret` file into `/var/lib/ganeti` and restart the server for that to work.

But what completely blocked me is this:
It looks like the source node is encoding NIC information in the backup and the target node is somewhat unhappy with it. I'm not sure how to debug this: I'm lost in the stack between the client and server method definitions, and I don't really understand what's going on.
Did anyone get that thing to work at all? What am I doing wrong?
Should I open separate issues for those things?