vmware / pyvmomi

VMware vSphere API Python Bindings
Apache License 2.0

multithreaded access to properties is slower than serial access #1084

Open veber-alex opened 1 month ago

veber-alex commented 1 month ago

Describe the bug

I noticed that accessing host properties from multiple threads is slower than doing so serially. I wrote a script to reproduce the issue:

# ruff: noqa

import ssl
from threading import Thread
import time

from pyVim.connect import SmartConnect
from pyVmomi import vim

NUM_THREADS = 8
HOST = ""
PASSWORD = ""

context = ssl._create_unverified_context()
con = SmartConnect(host=HOST, pwd=PASSWORD, sslContext=context)

host = con.content.viewManager.CreateContainerView(con.content.rootFolder, [vim.HostSystem], True).view[0]

threads = []
for i in range(NUM_THREADS):

    def print_driver(i):
        print(host.config.network.pnic[i].driver)

    t = Thread(target=print_driver, args=(i,))
    threads.append(t)

start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
end = time.time()
print(f"multi threaded: {end - start}")

start = time.time()
for i in range(NUM_THREADS):
    print(host.config.network.pnic[i].driver)
end = time.time()
print(f"single threaded: {end - start}")

On my host with 8 vmnics I get:

multi threaded: 11.908450603485107
single threaded: 3.76969313621521

Single-threaded performance is stable at around 4 seconds, but multithreaded performance varies between 6 and 12 seconds from run to run. Changing the script to always access pnic[0] gives the same result. The more threads run at the same time, the slower it gets.
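Since the threaded timings vary so much between runs, a single measurement can be misleading. Below is a minimal sketch of a harness that repeats each variant and reports the minimum, median, and maximum. The names `time_serial`, `time_threaded`, `summarize`, and `fake_access` are hypothetical and not part of pyvmomi; `fake_access` is only a placeholder for a real property read such as `host.config.network.pnic[i].driver`.

```python
import statistics
import time
from threading import Thread


def time_serial(task, n):
    """Call task(0) .. task(n-1) one after another; return elapsed seconds."""
    start = time.perf_counter()
    for i in range(n):
        task(i)
    return time.perf_counter() - start


def time_threaded(task, n):
    """Call task(0) .. task(n-1) in n threads; return elapsed seconds."""
    threads = [Thread(target=task, args=(i,)) for i in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def summarize(timer, task, n, repeats=5):
    """Repeat one measurement and return (min, median, max) in seconds."""
    samples = [timer(task, n) for _ in range(repeats)]
    return min(samples), statistics.median(samples), max(samples)


def fake_access(i):
    # Placeholder workload (an assumption for illustration only);
    # swap in the real property access when running against a host.
    time.sleep(0.01)
```

For example, `summarize(time_threaded, fake_access, 8)` and `summarize(time_serial, fake_access, 8)` can be compared side by side once `fake_access` is replaced with the real call.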

Reproduction steps

  1. set NUM_THREADS, HOST, PASSWORD
  2. run the repro script

Expected behavior

I expect multithreaded performance to be better than or equal to serial performance.

Additional context

No response

veber-alex commented 1 month ago

I did another test where I connect to 2 different hosts. Here is the code:

# ruff: noqa

import ssl
from threading import Thread
import time

from pyVim.connect import SmartConnect
from pyVmomi import vim

NUM_VMNICS = 4
HOST = ""
HOST2 = ""
PASSWORD = ""

context = ssl._create_unverified_context()

start = time.time()
for host in [HOST, HOST2]:
    con = SmartConnect(host=host, pwd=PASSWORD, sslContext=context)
    host = con.content.viewManager.CreateContainerView(con.content.rootFolder, [vim.HostSystem], True).view[0]
    for i in range(NUM_VMNICS + 1):
        print(f"host {host.name} - {host.config.network.pnic[i].driver}")

end = time.time()
print(f"single threaded: {end - start}")

threads = []
for host in [HOST, HOST2]:

    def print_driver(host):
        con = SmartConnect(host=host, pwd=PASSWORD, sslContext=context)
        host = con.content.viewManager.CreateContainerView(con.content.rootFolder, [vim.HostSystem], True).view[0]
        for i in range(NUM_VMNICS + 1):
            print(f"host {host.name} - {host.config.network.pnic[i].driver}")

    t = Thread(target=print_driver, args=(host,))
    threads.append(t)

start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
end = time.time()
print(f"multi threaded: {end - start}")

My results are:

single threaded: 5.33910870552063
multi threaded: 5.063408136367798

This tells me there is a bottleneck in pyvmomi itself and not in the ESXi host.

veber-alex commented 1 month ago

I did more tests, and it looks like the performance issues are caused by Python 3.7. With Python 3.11 and 3.12 the performance is much better.

prziborowski commented 1 month ago

Also you could consider using multiprocessing instead of threading, and create a service instance (i.e. SmartConnect) in each of them, so you aren't leveraging the same connection.
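A rough sketch of that suggestion, assuming one worker process per host and a fresh `SmartConnect` call inside each process. The helper names `pnic_drivers`, `worker`, and `drivers_by_host` are hypothetical. The connection arguments mirror the repro scripts above, including the unverified SSL context, which is only suitable for testing.

```python
import ssl
from concurrent.futures import ProcessPoolExecutor


def pnic_drivers(si, host_type):
    """Return the pnic driver names of the first host visible through `si`."""
    content = si.content
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [host_type], True
    )
    host = view.view[0]
    return [pnic.driver for pnic in host.config.network.pnic]


def worker(args):
    """Runs in a child process, so the connection is never shared."""
    host, password = args
    # Imported here so each process sets up its own pyVmomi state.
    from pyVim.connect import Disconnect, SmartConnect
    from pyVmomi import vim

    context = ssl._create_unverified_context()  # testing only, as in the repro
    si = SmartConnect(host=host, pwd=password, sslContext=context)
    try:
        return host, pnic_drivers(si, vim.HostSystem)
    finally:
        Disconnect(si)


def drivers_by_host(hosts, password):
    """Query every host in its own process; return {host: [driver, ...]}."""
    with ProcessPoolExecutor(max_workers=len(hosts)) as pool:
        return dict(pool.map(worker, [(h, password) for h in hosts]))
```

Call `drivers_by_host([HOST, HOST2], PASSWORD)` from under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module.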

veber-alex commented 1 month ago

I decided to reopen the issue after further testing.

While the performance numbers with newer versions of Python are better, the trend is still the same. Connecting to the same host from multiple threads is slower than running the code serially in one thread. When connecting to two different hosts, the improvement from using two threads is tiny, when in theory it should scale almost linearly with the number of hosts.
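To put numbers on that gap, here is a small sketch that computes the speedup from the timings posted earlier in this thread. The `speedup` helper is a hypothetical name used only for illustration. A value below 1 means the threaded run was slower than the serial one.

```python
def speedup(serial_seconds, threaded_seconds):
    """Return how many times faster the threaded run was (< 1 means slower)."""
    return serial_seconds / threaded_seconds


# Timings copied from the two experiments reported earlier in this thread.
same_host = speedup(3.76969313621521, 11.908450603485107)
two_hosts = speedup(5.33910870552063, 5.063408136367798)

print(f"one host, 8 threads:  {same_host:.2f}x")
print(f"two hosts, 2 threads: {two_hosts:.2f}x (near-linear scaling would be close to 2x)")
```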

> Also you could consider using multiprocessing instead of threading, and create a service instance (i.e. SmartConnect) in each of them, so you aren't leveraging the same connection.

Thanks but that's not an option in my codebase.