Error message:
python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
zsh: killed python neofuzztest.py
Code:
import random
import string
from neofuzz import char_ngram_process
def rand_str(length):
    characters = string.ascii_letters + string.digits
    return "".join(random.choice(characters) for _ in range(length))
names = [
    rand_str(8) + " " + rand_str(6) + " " + rand_str(4) + " " + str(i)
    for i in range(400_000)
]
print(len(names))
neofuzz_process = char_ngram_process()
neofuzz_process.index(names)
query = "test 3333"
pre_filter = neofuzz_process.extract(query, limit=2000, refine_levenshtein=True)
print(pre_filter[:10])
The blazing-fast speed of this library can only shine when it works on large datasets.
Hmm, interesting... Thanks for taking the time to look into this. Can I get the full error log? I have a feeling this might have something to do with PyNNDescent.
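One possible way to collect a fuller log is sketched below. It is only a sketch, not part of neofuzz: the helper name run_and_capture and the log file name are made up for illustration. It runs a command in a child process, saves everything the child wrote to stderr, and appends the exit code. On POSIX a negative exit code means the child was terminated by a signal, so -9 would match the "zsh: killed" line above (SIGKILL, which often, though not always, points to the process running out of memory).

```python
import subprocess
import sys


def run_and_capture(argv, log_path):
    """Run argv as a child process, write its stderr to log_path,
    and return the child's exit code.

    Hypothetical helper for illustration; not part of neofuzz.
    """
    # stderr is captured as text; stdout is left attached to the terminal.
    result = subprocess.run(argv, stderr=subprocess.PIPE, text=True)
    with open(log_path, "w", encoding="utf-8") as fh:
        fh.write(result.stderr)
        # On POSIX, a negative return code is the negated signal number
        # that terminated the child (for example -9 for SIGKILL).
        fh.write(f"\n[exit code: {result.returncode}]\n")
    return result.returncode
```

It could be called as run_and_capture([sys.executable, "-X", "faulthandler", "neofuzztest.py"], "neofuzz_error.log"). The -X faulthandler option makes Python dump a traceback on fatal signals it is able to handle, such as SIGSEGV. SIGKILL cannot be caught, so in that case the log would only contain the warnings printed before the kill plus the exit code.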