Hi
HAC-T clustering of a 500K-entry TLSH list took 6 hours on my machine, but the paper reports about 2 hours 10 minutes for 10 million samples.
(Paper: "HAC-T and Fast Search for Similarity in Security", https://tlsh.org/papersDir/COINS_2020_camera_ready.pdf)
Could you explain how you achieved this faster clustering? Does HAC-T support multithreading?
My experiment:
Data: 500K TLSH input
Command: python hac-t.py -f -o -cdist 90 -showtime 1 -showcl 1
Machine: 16 cores, 122 GB RAM
Python 3.8.8
Thanks