ifduyue / python-xxhash

Python Binding for xxHash
https://pypi.org/project/xxhash/
BSD 2-Clause "Simplified" License

Unexpectedly Poor Performance on AMD Ryzen 9 7950X? #61

Open JohnTravolski opened 1 year ago

JohnTravolski commented 1 year ago

I benchmarked a variety of Python operations on an i7 5960X (released late 2014) and an AMD Ryzen 9 7950X (released late 2022). The single-threaded performance of the 7950X is typically much higher than that of the 5960X, given the improvements made over that timeframe. However, I noticed that xxhash doesn't see the same speedup. Here's a comparison of a variety of Python functions and the time they took to run on both machines (Python 3.8, xxhash==1.4.3):

[image: table of benchmark timings for each operation on both machines]

The 3rd and 4th columns are the times in seconds for each machine to run each operation a given number of times, measured with either time.time() or time.perf_counter().

The code for measuring the performance of xxhash was as follows:

import xxhash
import time

hard = 10
datadict = {}

def timer(num_attempts = 10):
  def decor(func):
    def wrapper(*args):
      for timefunc, string in [(time.time, "time()"), (time.perf_counter, "perf_counter()")]:
        times = []
        for attempt_num in range(num_attempts):
          start = timefunc() # start counting time using the given timing function
          result = func(*args) # the code we're benchmarking
          end = timefunc()
          duration = end - start
          times.append(duration)
        average_time = sum(times)/len(times) # take the average of the times over num_attempts
        datadict[func.__name__ + "~" + string] = average_time
      return result
    return wrapper
  return decor

@timer(hard)
def xxhash_4kimg_5ktimes(filebits):
  j = 0
  for i in range(5000):
    j = xxhash.xxh64(filebits).hexdigest()
  return j

print("Timing hashing")
with open('2.jpg', 'rb') as afile:
  filebits = afile.read()
print("\txxhash:")
j = xxhash_4kimg_5ktimes(filebits)

for key, val in datadict.items(): # .items() yields (name, average time) pairs
  print(key, val)

('2.jpg' is a 2.94 MB jpeg file, 3840x2160 resolution). I'm on Windows 10.

Any idea why the speedup isn't as high as it was for many of the other functions I tested? On average I was getting at least a 2x speedup for most Python functions, but only 1.08x for xxhash. It was the only one that performed so poorly.
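One way to put the timing above in context is to convert it into bytes hashed per second, since each timed run pushes the whole file through xxh64 5000 times. The sketch below is only an illustration: the helper name throughput_gib_per_s is made up, the payload size is the 2.94 MB from the post taken as 2,940,000 bytes, and the elapsed time is a placeholder rather than a measured result.

```python
def throughput_gib_per_s(payload_bytes, iterations, elapsed_seconds):
    """Return hashing throughput in GiB/s for a timed loop.

    payload_bytes   -- size of the buffer hashed on each iteration
    iterations      -- how many times the buffer was hashed
    elapsed_seconds -- wall-clock time for the whole loop
    """
    total_bytes = payload_bytes * iterations
    return total_bytes / elapsed_seconds / 2**30


# Assumption: 2.94 MB is read as 2,940,000 bytes; 5000 matches the loop above.
payload = 2_940_000
loops = 5000

# Placeholder only: substitute the average time reported by the benchmark.
example_elapsed = 1.5

print(f"{throughput_gib_per_s(payload, loops, example_elapsed):.2f} GiB/s")
```

If the resulting figure is already in the range of what the memory subsystem can deliver, a faster core alone may not help much, though that is a guess and would need measuring on both machines.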

ifduyue commented 1 year ago

Can you try the latest version of python-xxhash and attach the results?

JohnTravolski commented 1 year ago

Can you try the latest version of python-xxhash and attach the results?

Good idea. Just retried it with xxhash version 3.2.0 for Python 3.8. Definitely an improvement, but I suppose still not as fast as most other Python functions:

[image: table of benchmark timings with xxhash 3.2.0]

A 1.47x speedup is definitely better than the 1.08x I had before, but still lagging behind the average of the others I tested.

Given that the single-threaded Cinebench score of the 7950X is over double that of the 5960X, my expectation was closer to 2x, which is what I got for a lot of other Python functions (as can be seen in my first post).

I'm guessing there's something about the algorithm that is causing it to not scale as well on this processor, but I don't know what it would be.

ifduyue commented 1 year ago

Thanks for the quick feedback. I'll try to find out why.

ifduyue commented 1 year ago

I haven't done any tests yet, but a wild guess is that it is the hexdigest code: https://github.com/ifduyue/python-xxhash/blob/f8086d4a8156c04b213366ce9a4e2b5f0f330dfe/src/_xxhash.c#L268-L276
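A rough way to check that guess would be to time the same hash with intdigest(), which returns the integer without building a hex string, and with hexdigest(). This is just a sketch: best_of and compare_digest_costs are hypothetical helper names, the loop count is arbitrary, and it assumes python-xxhash is installed and that '2.jpg' is the file from the original benchmark.

```python
import time


def best_of(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) over `repeats` calls to fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare_digest_costs(path, loops=200):
    """Time xxh64 + intdigest() against xxh64 + hexdigest() on one file."""
    import xxhash  # third-party; imported here so best_of() works without it

    with open(path, "rb") as f:
        data = f.read()

    def with_intdigest():
        for _ in range(loops):
            xxhash.xxh64(data).intdigest()

    def with_hexdigest():
        for _ in range(loops):
            xxhash.xxh64(data).hexdigest()

    return {
        "intdigest": best_of(with_intdigest),
        "hexdigest": best_of(with_hexdigest),
    }
```

Calling compare_digest_costs('2.jpg') on both machines and comparing the two entries should show whether the hex conversion accounts for a meaningful share of the time. If the two numbers are nearly equal, the cost is more likely in hashing the buffer itself than in hexdigest.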