xcp-ng / xcp

Entry point for issues and wiki. Also contains some scripts and sources.
https://xcp-ng.org

Live migration shrinks VM resources, causing MySQL to crash #268

Open elialum opened 5 years ago

elialum commented 5 years ago

During a successful live migration inside the pool (different host, same SR), VM resources such as memory and CPU shrink live to a bare minimum. You can actually watch them shrink if you run the "top" command during the migration.

Although the migration appears to succeed, applications such as MySQL that are more "sensitive" to these changes will crash. In some cases MySQL refused to start with a "connection refused" error, and the only way out was a full VM reboot.

The latest version of xen-tools was installed (I also tried without xen-tools, same behavior).

XCP Version used -

xen-hypervisor-4.7.6-6.5.1.xcpng.x86_64
xen-tools-4.7.6-6.5.1.xcpng.x86_64
xenopsd-xenlight-0.66.0-1.1.xcpng.x86_64
xcp-ng-deps-7.6.0-5.noarch
vhd-tool-0.27.0-1.1.xcp.el7.centos.x86_64
xcp-ng-plymouth-theme-1.0.0-3.noarch
xcp-ng-release-config-7.6.0-3.x86_64
xcp-ng-pv-tools-7.32.0-2.1.xcp.noarch
xcp-ng-release-7.6.0-3.x86_64
message-switch-1.12.0-5.1.xcp.el7.centos.x86_64
openvswitch-2.5.3-2.2.3.3.xcpng.x86_64
xcp-emu-manager-1.1.2-1.xcpng.x86_64
xen-libs-4.7.6-6.5.1.xcpng.x86_64
xapi-tests-1.110.1-1.7.xcpng.x86_64
xen-dom0-tools-4.7.6-6.5.1.xcpng.x86_64
xenopsd-xc-0.66.0-1.1.xcpng.x86_64
xcp-ng-xapi-plugins-1.4.0-1.xcpng.noarch
xcp-ng-center-7.6.3.21-1.noarch
xcp-featured-1.1.1-2.el7.centos.x86_64
xapi-core-1.110.1-1.7.xcpng.x86_64
xsconsole-10.1.7-1.2.xcp.x86_64
xcp-rrdd-1.9.0-4.el7.centos.x86_64
xcp-ng-generic-lib-1.1.1-1.xcpng.x86_64
xen-dom0-libs-4.7.6-6.5.1.xcpng.x86_64
xenopsd-0.66.0-1.1.xcpng.x86_64
microcode_ctl-2.1-26.xs5.1.xcpng.x86_64
xapi-xe-1.110.1-1.7.xcpng.x86_64
xcp-networkd-0.34.0-3.el7.centos.x86_64
xenserver-firstboot-1.0.9-1.2.xcp.noarch
xcp-python-libs-2.0.5-1.1.xcp.noarch

stormi commented 5 years ago

Hi. You can avoid this by setting an appropriate dynamic min value for your VM's RAM.

olivierlambert commented 5 years ago

This is normal behavior for the platform, I think by default since XenServer 7.0. Prior to any migration, Xen reduces guest memory to the dynamic min RAM, then starts migrating.

Double-check that your dynamic min is high enough for your guest's needs.
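For reference, a minimal sketch of how the current values could be read from dom0 with the `xe` CLI. The helper names and the UUID are made up for illustration, and it assumes `xe vm-param-get` prints the value as a plain byte count. The commands are only built and printed here, not executed.

```python
def param_get_cmd(vm_uuid, param_name):
    """Build the xe command line that reads one VM parameter."""
    return ["xe", "vm-param-get", f"uuid={vm_uuid}", f"param-name={param_name}"]


def bytes_to_mib(raw_value):
    """Convert a byte count printed by xe (assumed to be a plain integer) to MiB."""
    return int(raw_value.strip()) // (1024 * 1024)


# Hypothetical UUID, for illustration only.
uuid = "00000000-0000-0000-0000-000000000000"
for name in ("memory-dynamic-min", "memory-dynamic-max"):
    print(" ".join(param_get_cmd(uuid, name)))
```

If the two values differ, the lower one is what the guest should be able to live with during a migration.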

elialum commented 5 years ago

Guys,

Wow... I was not aware of that. I checked the affected VMs and... YES :) I can confirm that dynamic min RAM was lower than dynamic max RAM.

Besides migration, will XenServer also reduce memory / CPU during runtime? If, for example, the physical host is overloaded, will XenServer move resources from one VM to another? (live, that is).

Thank you for pointing that out. We were not aware of this feature, BTW.

As we are working with XOA templates, it's easier for us to clone the template, then simply resize the memory through XOA's VM "general" page (easier and quicker). This automatically keeps the first value as the lower dynamic point and sets the new value (given that it's higher) as the higher dynamic point. Maybe it would be a good idea to add a checkbox or something to make it fixed if needed.

olivierlambert commented 5 years ago

To be fair, I discovered this behavior myself out of the blue, by having a VM crash during a live migration a while back…

Indeed, we might create a warning in the XO UI when you migrate a VM with a dynamic min < current memory, telling people that Xen will reduce RAM to this value and letting users change that before actually starting the migration. What do you think?
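A rough sketch of what such a pre-migration check could look like. This is not XO code: the function name and the wording of the message are assumptions, and the memory values are expected in bytes.

```python
def migration_warning(current_bytes, dynamic_min_bytes):
    """Return a warning string if migration would shrink the VM, else None."""
    if dynamic_min_bytes >= current_bytes:
        return None
    lost_mib = (current_bytes - dynamic_min_bytes) // (1024 * 1024)
    return (
        f"Xen will reduce this VM's RAM by {lost_mib} MiB "
        "(down to dynamic min) before migrating."
    )


GIB = 1024 ** 3
print(migration_warning(8 * GIB, 2 * GIB))  # dynamic min below current memory
print(migration_warning(4 * GIB, 4 * GIB))  # nothing to warn about
```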

elialum commented 5 years ago

I think that you shouldn't allow dynamic changes during migrations. It has already crashed a handful of servers on our side, and probably for others too, so it doesn't make sense for live migrations.

olivierlambert commented 5 years ago

It makes sense from Xen's perspective (that's why they introduced it in the first place): you only transfer the minimal amount of RAM.

So I don't want to change the behavior on the XCP-ng side, but warning users is also a solution.

Ultra2D commented 5 years ago

I think I have mentioned this before, but I want to be able to completely disable this dynamic behaviour on a pool, as in don't allow any VMs to have a lower dynamic min. It might be added as an additional setting in XOA.

olivierlambert commented 5 years ago

You can already do that. In the Advanced VM tab settings, you can edit all values (dynamic min and max), set the same value and that's it.
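For people who prefer the command line, a small sketch of the equivalent `xe` call, assuming `xe vm-memory-dynamic-range-set` with `min` and `max` in bytes and a target that fits within the VM's static limits. The helper name and UUID are hypothetical, and the command is only built, not run.

```python
def pin_dynamic_range_cmd(vm_uuid, memory_bytes):
    """Build the xe command that sets dynamic min and max to the same value."""
    return [
        "xe",
        "vm-memory-dynamic-range-set",
        f"uuid={vm_uuid}",
        f"min={memory_bytes}",
        f"max={memory_bytes}",
    ]


# Hypothetical UUID; 4 GiB expressed in bytes.
cmd = pin_dynamic_range_cmd("00000000-0000-0000-0000-000000000000", 4 * 1024 ** 3)
print(" ".join(cmd))
```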

Ultra2D commented 5 years ago

I know, but I would want this to be enforced for an entire pool, so people can't make a mistake when setting up a VM.

olivierlambert commented 5 years ago

I'm not sure that's even possible in XAPI.

edit: obviously, if you know how (which XAPI call), feel free to tell us and make a small spec so we can implement it :)

Ultra2D commented 5 years ago

Haha, I don't have a clue, sorry. I suggested XOA, because that's the interface used to manage the pools in most cases anyway. In my limited understanding it should be possible (but probably not feasible) to do this? When creating a VM from XOA, disable/block the Dynamic-fields and set them to Static memory max value?

elialum commented 5 years ago

As we are creating/moving VMs all the time, it's indeed hard to control this feature. We set up a daily cron job that loops through the pools using the XOA API, identifies this situation in the VMs and changes the params accordingly.

elialum commented 5 years ago

@Ultra2D Here is a very basic example, just to get the concept -

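// $xoa->command() is assumed to run xo-cli and return its decoded JSON output
$VMS = $xoa->command("xo-cli --list-objects type=VM");
foreach ($VMS as $VM) {
    // dynamic[0] is the dynamic min, dynamic[1] the dynamic max
    if ($VM['memory']['dynamic'][0] < $VM['memory']['dynamic'][1]) {
        $xoa->command("xo-cli vm.set id=".$VM['uuid']." memoryMin=".$VM['memory']['dynamic'][1]);
    }
}

olivierlambert commented 5 years ago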

Yeah, so there's no parameter to tell the pool not to use dynamic memory globally. So we'll have to rely on a "hack", or an extra value we decide to store in some other_config field: read this value and always keep dynamic min and max synchronized. That's far from perfect, but I'd like to have more of a functional spec, guys, e.g.:

  1. Do you want to disable it poolwide? or on your XO user configuration?
  2. Do you want to have exceptions (VMs that can use different dynamic min/max)? If so, how should we handle this in the UI?
  3. There are probably other questions like that which I don't have in mind

So before doing an ugly hack, I'd like to have a proper spec (not technical: functional, ie how do you imagine this from the UI perspective)
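To make the "hack" a bit more concrete, a sketch of the selection logic only, under stated assumptions: both other_config key names are invented for illustration, and each VM is represented as a plain dict rather than a real XAPI record.

```python
POOL_KEY = "xo:force-static-memory"            # hypothetical pool other_config key
VM_EXCEPTION_KEY = "xo:allow-dynamic-memory"   # hypothetical per-VM opt-out key


def vms_to_sync(pool_other_config, vms):
    """Return UUIDs of VMs whose dynamic min should be raised to dynamic max."""
    if pool_other_config.get(POOL_KEY) != "true":
        return []  # feature not enabled on this pool
    return [
        vm["uuid"]
        for vm in vms
        if vm["other_config"].get(VM_EXCEPTION_KEY) != "true"
        and vm["dynamic_min"] < vm["dynamic_max"]
    ]


vms = [
    {"uuid": "vm-a", "other_config": {}, "dynamic_min": 1, "dynamic_max": 4},
    {"uuid": "vm-b", "other_config": {VM_EXCEPTION_KEY: "true"}, "dynamic_min": 1, "dynamic_max": 4},
    {"uuid": "vm-c", "other_config": {}, "dynamic_min": 4, "dynamic_max": 4},
]
print(vms_to_sync({POOL_KEY: "true"}, vms))
```

Only the first VM is selected here: the second is an explicit exception and the third is already static.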

Ultra2D commented 5 years ago

For me, I would want a setting that applies to all pools/hosts/whatever managed by XOA. Just set it once in XOA and forget. If it's easier to set it per pool on the settings/servers page, I wouldn't mind. It would change the default behaviour when creating a VM to use static memory: when changing the RAM field, you change Dynamic memory min, Dynamic memory max and Static memory max together.

Next, you indeed might or might not want some or all users to be able to override this (later) per VM. In my case it is not needed at all, so block all changes in advanced settings when creating and editing a VM. If someone changes it another way, so be it I guess, but other people might think differently about that.

So two settings: default to static memory (yes/no), and allow exceptions (yes/no). These could be conflated into one setting with three options (no/default only/always). But I realize my use case is quite limited. I'm not sure if others want to actually enforce it (so revert changes made outside of XOA), and have fine-grained permissions.
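The single three-option setting could be sketched roughly as follows. Everything here is hypothetical (names, the dynamic-memory branch, and especially what static max should be in that case); it only illustrates how "no / default only / always" would drive the RAM field and the advanced settings.

```python
from enum import Enum


class StaticMemoryPolicy(Enum):
    NO = "no"                      # keep dynamic memory as the default
    DEFAULT_ONLY = "default only"  # static by default, overrides allowed
    ALWAYS = "always"              # static, dynamic fields blocked


def memory_values_for_ram_field(policy, ram_bytes, template_dynamic_min):
    """Return (dynamic_min, dynamic_max, static_max) after the RAM field changes."""
    if policy is StaticMemoryPolicy.NO:
        # Assumed dynamic behaviour: the template's dynamic min is kept.
        return (min(template_dynamic_min, ram_bytes), ram_bytes, ram_bytes)
    return (ram_bytes, ram_bytes, ram_bytes)


def can_edit_dynamic_fields(policy):
    """Whether the advanced settings may still change dynamic min/max."""
    return policy is not StaticMemoryPolicy.ALWAYS


GIB = 1024 ** 3
for policy in StaticMemoryPolicy:
    values = memory_values_for_ram_field(policy, 8 * GIB, 2 * GIB)
    print(policy.value, values, can_edit_dynamic_fields(policy))
```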

olivierlambert commented 5 years ago

That's a bit complicated to implement (would probably require a lot of spaghetti code). We'll try to find a way that's easy and flexible at once.

edit: please create the issue on XO repo.

Ultra2D commented 5 years ago

Well, then don't ask me how I imagine it ;) . BTW, I always liked the dialog in XenCenter. I used "Set a fixed memory of" all the time.

https://github.com/vatesfr/xen-orchestra/issues/2481 could be used? Or should I (or @elialum) create a new one?

olivierlambert commented 5 years ago

I prefer to know exactly what you want, even if it's too complicated. It helps a lot to find a balance between what you want and what we can reasonably do ;)

IDK regarding the other issue, that's more a question for XO devs.

borzel commented 5 years ago

I would like to disable the use of dynamic memory globally (per pool?), so nobody can use it unintentionally. --> Every time we migrate a long-running VM that has dynamic memory (someone did not use static on creation :-/ ), it crashes (mostly dev database servers).