OpenNebula / one

The open source Cloud & Edge Computing Platform bringing real freedom to your Enterprise Cloud 🚀
http://opennebula.io
Apache License 2.0

Incorrect NUMA Node and CPU Pinning During VM Migration #6772

Open feldsam opened 3 weeks ago

feldsam commented 3 weeks ago


Description

The current Huge Pages support, introduced by the enhancement "Support use of huge pages without CPU pinning #6185", selects a NUMA node for the VM based on the node's free resources. At deployment time this mechanism balances load across NUMA nodes well. During VM migration, however, the selection is not re-evaluated on the target host, which leads to the inconsistencies described below.
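For context, the selection amounts to picking the NUMA node with the most free hugepages that can still hold the VM. A minimal standalone sketch of that idea (the `NumaNode` struct and `pick_node` function are illustrative, not OpenNebula's actual code):

```cpp
#include <cstdint>
#include <vector>

// Illustrative model of a host NUMA node; not OpenNebula's internal class.
struct NumaNode
{
    int      node_id;
    uint64_t free_hugepages; // free pages of the VM's requested page size
};

// Pick the NUMA node with the most free hugepages that can still hold the
// VM's memory; returns -1 when no node fits, i.e. the allocation fails.
static int pick_node(const std::vector<NumaNode>& nodes, uint64_t pages_needed)
{
    int      best_id   = -1;
    uint64_t best_free = 0;

    for (const NumaNode& n : nodes)
    {
        if (n.free_hugepages >= pages_needed && n.free_hugepages > best_free)
        {
            best_id   = n.node_id;
            best_free = n.free_hugepages;
        }
    }

    return best_id;
}
```

Because the result depends on the current host's free pages, it has to be recomputed on the migration target; carrying over the node id chosen on the source host is exactly what produces the failures below.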

To Reproduce

  1. Configure a VM to use Huge Pages and deploy it on a host (see the template sketch after this list).
  2. Initiate a migration using the standard save/restore or live migration method.
  3. Observe that the VM keeps using the old NUMA node on the target host, even when the scheduler selects a different NUMA node based on the target host's free resources.
  4. If the old NUMA node on the target does not have enough free memory, the migration may fail.
  5. Deploy new VMs and note the inconsistencies caused by the incorrectly pinned VMs.
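
Step 1 can be reproduced with a template along these lines (values are illustrative; hugepages are requested via TOPOLOGY without a PIN_POLICY, so only the memory is NUMA-bound):

```
MEMORY = "4096"
CPU    = "1"
VCPU   = "2"

# 2 MB hugepages, no PIN_POLICY: memory is backed by hugepages on the
# selected NUMA node while the vCPUs stay unpinned
TOPOLOGY = [
  HUGEPAGE_SIZE = "2"
]
```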

Expected behavior

On migration, the NUMA node should be selected again according to the target host's free resources, and the VM's pinning and both hosts' NUMA usage counters should be updated to match.

feldsam commented 3 weeks ago

@paczerny Hi, could you help me with my commit? https://github.com/FELDSAM-INC/one/commit/47641337a6748e92d4cd774b88fba902f0d4efd0

It doesn't work yet. Only the path you implemented works: the classic migration using save/restore, where the pinned CPUs are properly cleaned up. I tried to implement the same for live migration and for migration through poweroff, but it doesn't work. What did I do wrong? Thanks!
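
Conceptually, all three migration paths need the same two NUMA steps no matter how the memory itself is moved. A sketch building on the `NumaNode`/`pick_node` definitions above (the `release_pages`/`reserve_pages` helpers and `migrate_numa` are hypothetical, not OpenNebula's real API):

```cpp
#include <stdexcept>

// Hypothetical helpers: adjust a node's free-hugepage counter on a host.
static void release_pages(std::vector<NumaNode>& nodes, int id, uint64_t pages)
{
    for (NumaNode& n : nodes)
    {
        if (n.node_id == id) n.free_hugepages += pages;
    }
}

static void reserve_pages(std::vector<NumaNode>& nodes, int id, uint64_t pages)
{
    for (NumaNode& n : nodes)
    {
        if (n.node_id == id) n.free_hugepages -= pages;
    }
}

// The NUMA step every migration flavor (save/restore, live, poweroff) would
// share: drop the source pinning, then re-run the selection against the
// *target* host's free pages instead of copying the source node id over.
static int migrate_numa(std::vector<NumaNode>& source, int source_node,
                        std::vector<NumaNode>& target, uint64_t pages)
{
    release_pages(source, source_node, pages);

    int target_node = pick_node(target, pages);

    if (target_node == -1)
    {
        throw std::runtime_error("no NUMA node with enough free hugepages");
    }

    reserve_pages(target, target_node, pages);

    return target_node;
}
```

Under this view the three paths would differ only in *when* the step runs relative to the memory transfer, not in what it does.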