Closed seanlaw closed 4 years ago
Of course it is supposed to be much better. Did you by chance use a random string? My code is specially optimized for that.
Do you have a smaller example I can test on? Please include your numpy code.
Also, the memory usage of my code should be linear with a small constant if you use np.unique and not a dict.
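A quick side-by-side of the two rank-assignment strategies (a minimal sketch; `to_int_keys_dict` is a hypothetical stand-in for the dict-based pure-Python version):

```python
import numpy as np

def to_int_keys_dict(l):
    # Hypothetical dict-based ranking: builds a Python dict mapping each
    # distinct value to its rank among the sorted distinct values
    seen = {v: i for i, v in enumerate(sorted(set(l)))}
    return [seen[v] for v in l]

def to_int_keys_np(l):
    # np.unique sorts the distinct values; return_inverse gives each
    # element's index into that sorted array -- the same ranks, but held
    # in a single flat integer array instead of a dict of Python objects
    return np.unique(l, return_inverse=True)[1]

vals = [30, 10, 20, 10]
assert to_int_keys_dict(vals) == [2, 0, 1, 0]
assert to_int_keys_np(vals).tolist() == [2, 0, 1, 0]
```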
I basically modified your example.py with

```python
random_string = np.random.randint(100, size=300_000_000, dtype=np.uint8)
```

So, it's nothing special, but it still took close to 2 hours.
The naive NumPy function (based on your Python implementation) is:
```python
import numpy as np
import time


def to_int_keys_np(l):
    # Rank each element by its position among the sorted distinct values
    indices = np.unique(l, return_inverse=True)[1]
    return indices


def suffix_array_np(s):
    n = len(s)
    k = 1
    line = to_int_keys_np(s)
    tmp_line = np.ones(n, dtype=np.int64)
    # Prefix doubling: stop as soon as all ranks are distinct
    while max(line) < n - 1:
        tmp_line[:] = -1
        tmp_line[:-k] = line[k:]
        line[:] = (n + 1) * line + tmp_line + 1
        line[:] = to_int_keys_np(line)
        k <<= 1
    return line


if __name__ == '__main__':
    inp = np.random.randint(100, size=300_000_000)
    start = time.time()
    sa_np = suffix_array_np(inp)
    print(time.time() - start)
```
Yeah, my implementation uses this really nice trick, `max(line) < n - 1`, that basically kills the complexity for random strings.
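A small check of that behavior (my own sketch, separate from the benchmark above): for a random string over ~100 symbols, the doubled-length substrings become pairwise distinct after only a few rounds, so `max(line) < n - 1` exits the loop early instead of running all ~log2(n) doublings.

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.integers(0, 100, size=100_000)  # random string over an alphabet of 100
n = len(s)
line = np.unique(s, return_inverse=True)[1]
k = 1
rounds = 0
while line.max() < n - 1:  # stop once every suffix has a unique rank
    tmp = np.full(n, -1, dtype=np.int64)
    tmp[:-k] = line[k:]
    # Prefix-doubling step: combine rank pairs and re-rank
    line = np.unique((n + 1) * line + tmp + 1, return_inverse=True)[1]
    k <<= 1
    rounds += 1
print(rounds)  # only a handful of rounds, well below ceil(log2(n)) == 17
```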
Could you test with a constant string? Or try a smaller alphabet (fewer distinct characters); my algorithm will take much more time. Also, I guess you are using a powerful machine with multiple cores, and NumPy uses powerful parallelization. You should compile libdivsufsort with OpenMP enabled; to do that, just add the option `-DUSE_OPENMP=ON` to the cmake call.
I really cannot reproduce. There is a loop in example.py that repeats the computation 100 times. Are you sure this is not the cause of your problem?
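A minimal sketch of that kind of pitfall (hypothetical; the actual example.py is not reproduced here): a benchmark harness that repeats the call inflates the measured wall time by the repeat count, so a single timed call is what should be compared.

```python
import time

def benchmark(fn, repeat=100):
    # The total wall time covers all `repeat` calls; dividing by the
    # repeat count recovers the per-run cost a single timed call reports
    start = time.time()
    for _ in range(repeat):
        fn()
    total = time.time() - start
    return total, total / repeat

total, per_run = benchmark(lambda: sum(range(100_000)))
print(f"{total:.4f}s total, {per_run:.6f}s per run")
```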
Well, that's embarrassing! I totally overlooked the for-loop. Indeed, that was the issue! I'm glad it was that and not something else. One run took about 67 seconds. Thanks, and sorry for the false alarm 👍
That's just how performant libdivsufsort is!
@louisabraham Based on your original Python implementation, I played around with a naive NumPy implementation and, for an input array that is `300_000_000` in length, my NumPy implementation takes around 20 minutes while the libdivsufsort version takes 100 minutes. Of course, the memory usage of libdivsufsort is very small (almost an order of magnitude less than the NumPy version), but I expected the computation time to be much, much better.