bryansteiner / gpu-passthrough-tutorial

GNU General Public License v3.0

MEMORY value in the hugepages hook, and what to do on failure? #11

Closed wmarler closed 4 years ago

wmarler commented 4 years ago

Hey Bryan

Absolutely terrific writeup. Very well done.

I have a couple of questions:

  1. Parsing through your steps, you have these 2 lines in alloc_hugepages.sh:

    ## Calculate number of hugepages to allocate from memory (in MB)
    HUGEPAGES="$(($MEMORY/$(($(grep Hugepagesize /proc/meminfo | awk '{print $2}')/1024))))"

    Where does the $MEMORY value come from? Is that an environment variable that is available to the qemu/libvirt/whatever process that is running the hook script, or is that something that should be defined by the user in the kvm.conf?

  2. What should a user do if the host fails to allocate the hugepages on VM start? I experimented with this script on its own by assigning a value to MEMORY just before the HUGEPAGES=... assignment, and found that if I set MEMORY too high, the hugepages couldn't be allocated (presumably within 1,000 tries):

    10:38:01 root /etc/libvirt/hooks/qemu.d/Win10Full% prepare/begin/alloc_hugepages.sh
    Allocating hugepages...
    $HUGEPAGES == 8192
    Succesfully allocated 4067 / 8192
    Succesfully allocated 4096 / 8192
    Succesfully allocated 4100 / 8192
    ...
    Succesfully allocated 6479 / 8192
    Succesfully allocated 6479 / 8192
    Succesfully allocated 6481 / 8192
    Succesfully allocated 6481 / 8192
    Not able to allocate all hugepages. Reverting...

    (The extra '$HUGEPAGES == 8192' there is a debug string I added for testing)

bryansteiner commented 4 years ago

Where does the $MEMORY value come from?

The first line of code in every hook script loads kvm.conf which contains the value for MEMORY:

## Load the config file
source "/etc/libvirt/hooks/kvm.conf"
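To make the arithmetic in the HUGEPAGES line concrete, here is a minimal sketch of the same calculation pulled out into a function. The function name `hugepages_for` is hypothetical, and `MEMORY=16384` is an assumed example value (in MB) chosen because it yields the 8192 pages discussed in this thread; it is not quoted from kvm.conf.

```shell
#!/usr/bin/env bash
# Sketch: derive the hugepage count from MEMORY (in MB) and the page size (in kB).
# hugepages_for is a hypothetical helper, not part of the tutorial's scripts.

# hugepages_for MEMORY_MB HUGEPAGESIZE_KB -> prints the number of pages
hugepages_for() {
    local memory_mb="$1" pagesize_kb="$2"
    local pagesize_mb=$(( pagesize_kb / 1024 ))   # 2048 kB -> 2 MB
    echo $(( memory_mb / pagesize_mb ))           # integer division, rounds down
}

MEMORY=16384          # assumed example value, as it would be set in kvm.conf
HUGEPAGESIZE_KB=2048  # the Hugepagesize value reported in this thread

HUGEPAGES="$(hugepages_for "$MEMORY" "$HUGEPAGESIZE_KB")"
echo "HUGEPAGES=$HUGEPAGES"
```

Because the division is integer division, a MEMORY value that is not a multiple of the page size is rounded down to the nearest whole page.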

What should a user do, if the host fails to successfully allocate the hugepages on VM start?

You might not have enough RAM available to allocate the number of hugepages you're attempting. In my case, each hugepage has a size of 2048 kB and I'm allocating 8192 hugepages, so I'm reserving 16 GB (out of 32 GB) of RAM. It seems like you don't have enough memory, so decrease the number of hugepages by lowering the value for MEMORY.

$ grep MemTotal /proc/meminfo
MemTotal: 132151496 kB
$ grep Hugepagesize /proc/meminfo
Hugepagesize:       2048 kB
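One way to catch an oversized MEMORY before the hook starts retrying is to compare it against what the host reports as available. The sketch below is an illustration only: `meminfo_kb` and `fits_in_memory` are hypothetical helpers, and `MEMORY=16384` is an assumed example value. Passing this check is necessary but not sufficient, since hugepages also need contiguous physical memory, so allocation can still fall short on a fragmented host.

```shell
#!/usr/bin/env bash
# Sketch: check whether a requested MEMORY (in MB) fits in the RAM the host reports.
# Both helper functions are hypothetical and not part of the tutorial's scripts.

# meminfo_kb FIELD [FILE] -> prints the kB value of FIELD (default: /proc/meminfo)
meminfo_kb() {
    local field="$1" file="${2:-/proc/meminfo}"
    awk -v f="${field}:" '$1 == f { print $2 }' "$file"
}

# fits_in_memory MEMORY_MB AVAILABLE_KB -> exit status 0 if the request fits
fits_in_memory() {
    local memory_mb="$1" available_kb="$2"
    (( memory_mb * 1024 <= available_kb ))
}

MEMORY="${MEMORY:-16384}"   # assumed example value in MB
available_kb="$(meminfo_kb MemAvailable)"

if fits_in_memory "$MEMORY" "${available_kb:-0}"; then
    echo "MEMORY=${MEMORY} MB fits within MemAvailable (${available_kb} kB)"
else
    echo "MEMORY=${MEMORY} MB exceeds MemAvailable (${available_kb:-unknown} kB); lower MEMORY"
fi
```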

wmarler commented 4 years ago

The first line of code in every hook script loads kvm.conf which contains the value for MEMORY

I thought as much. However, when you define kvm.conf in your write-up, you don't say that the variable MEMORY needs to be declared there. Your definition of kvm.conf just has the PCI addresses of devices to pass through.

You might also want to give guidance on how this value is to be arrived at (since you're so thorough about everything else!)

I wasn't sure if MEMORY was an environment variable defined by / used by the qemu process that invoked the hook script, so I was hesitant to put it in the script itself, or in the kvm.conf.

Thanks,


bryansteiner commented 4 years ago

Your definition of kvm.conf just has the pci addresses of devices to passthrough.

Not sure if we're looking at the same file (https://github.com/bryansteiner/gpu-passthrough-tutorial/blob/master/kvm/kvm.conf), but there's more defined than just PCI bus addresses.

You might also want to give guidance on how this value is to be arrived at (since you're so thorough about everything else!)

Appreciate the feedback. There's always going to be some prerequisite level of Linux knowledge when attempting these types of tutorials. I figured most people would be able to see the conf file, but you're the first to mention it, so I'll consider adding more details.

wmarler commented 4 years ago

Ermahgerd ... I was 100% only looking at the readme.md. I didn't even realize there were files in this repo!

Now I also see ... QEMU options, including HOST_CORES_MASK=3F03F. I definitely think understanding how one creates that bitmask is a topic worthy of explanation. More difficult to get than setting a MEMORY amount for sure. I'm assuming these variables are for upcoming hooks for dynamic CPU isolation, is that right? I was chasing that myself when someone on reddit mentioned that when he's gaming he's not doing anything on the host anyway so ... He didn't do it. And that made me ponder whether it was worth it for me too. Maybe you're pondering the same?
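For what it's worth, a mask like 3F03F can be read as one bit per CPU: bit N is set when CPU N is included. The sketch below shows one way to build such a mask from a CPU list. The function name `cpu_list_to_mask` is hypothetical, and the list `0-5,12-17` is simply the list that produces 3F03F; whether that is how the value in kvm.conf was derived, or which cores it is meant to describe, is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: turn a CPU list such as "0-5,12-17" into a hexadecimal bitmask.
# Bit N of the mask is set when CPU N is in the list.
# cpu_list_to_mask is a hypothetical helper, not part of the tutorial's scripts.

cpu_list_to_mask() {
    local list="$1" mask=0 part start end cpu
    local -a parts
    IFS=',' read -ra parts <<< "$list"
    for part in "${parts[@]}"; do
        start="${part%-*}"   # "0-5" -> 0, "7" -> 7
        end="${part#*-}"     # "0-5" -> 5, "7" -> 7
        for (( cpu = start; cpu <= end; cpu++ )); do
            (( mask |= 1 << cpu ))
        done
    done
    printf '%X\n' "$mask"
}

cpu_list_to_mask "0-5,12-17"   # prints 3F03F
```

CPUs 0-5 contribute 0x3F and CPUs 12-17 contribute 0x3F000, which together give 0x3F03F. Since this relies on bash's 64-bit integer arithmetic, it only covers CPU numbers below 63.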

Anyway, I digress. Thanks for taking the time to read & respond.

Will
