My multi-threaded, GPU-enabled implementation of Density-Based Spatial Clustering of Applications with Noise (DBSCAN).
The GPU implementation achieves a >20x speedup over the de facto standard scikit-learn implementation; the multi-threaded CPU implementation is about 40% faster than scikit-learn when run with a single thread, and scales further with additional threads.
Built with C++17, GCC 7.5, CUDA 10.2, Thrust, and pthreads.
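For reference, DBSCAN takes two parameters: `eps` (the neighbourhood radius) and `min-pts` (the minimum neighbourhood size for a point to count as a core point). Clusters are grown by linking core points whose neighbourhoods overlap, and points reachable from no core point are labelled noise. The deliberately naive Python sketch below only pins down those semantics; it is not this repository's implementation and is far too slow for the inputs listed further down.

```python
import numpy as np

def dbscan(points, eps, min_pts):
    """Naive O(n^2) reference DBSCAN, for illustration only.
    Returns one label per point: 0, 1, ... for clusters, -1 for noise."""
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Epsilon-neighbourhoods by brute force, one row at a time.
    neighbours = [np.flatnonzero(np.linalg.norm(points - p, axis=1) <= eps)
                  for p in points]
    core = [len(nb) >= min_pts for nb in neighbours]

    labels = np.full(n, -1)      # -1 = unvisited / noise
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        # Grow a new cluster outward from an unclaimed core point.
        labels[i] = cluster
        frontier = list(neighbours[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster
                if core[j]:      # only core points extend the cluster
                    frontier.extend(neighbours[j])
        cluster += 1
    return labels
```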
To build, first fetch the submodules and configure:

```
git submodule update --init --recursive
cmake -DCMAKE_BUILD_TYPE=None -Bbuild -H.
```
- `cpu-main`: set `AVX=1` to enable AVX; set `BIT_ADJ=1` if, on average, each vertex has more than |V|/64 neighbours (see the sizing sketch below).
- `gpu-main`: in `gpu/CMakeLists.txt`, change the architecture code to fit your hardware.

```
cmake --build build --target cpu-main
cmake --build build --target gpu-main
```
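The |V|/64 rule of thumb presumably reflects the memory break-even point between a bit-packed adjacency matrix (one bit per possible neighbour, i.e. |V|/64 64-bit words per vertex) and plain neighbour lists (one id per actual edge). That rationale is an assumption on my part, not something stated by the build system; a back-of-the-envelope sketch in Python, assuming 8-byte neighbour ids:

```python
def adjacency_memory_bytes(num_vertices, avg_degree, id_bytes=8):
    """Rough memory comparison between a dense bit-adjacency matrix and
    plain neighbour lists. What BIT_ADJ actually toggles is defined by the
    implementation; this is only the sizing argument.

    bitmap: one bit per possible neighbour -> |V| * |V| / 8 bytes in total
    lists:  one id per actual edge         -> |V| * avg_degree * id_bytes
    """
    bitmap = num_vertices * num_vertices // 8
    lists = int(num_vertices * avg_degree * id_bytes)
    return bitmap, lists

# With 8-byte ids the bitmap becomes the smaller structure once the average
# degree exceeds |V| / 64, which matches the rule of thumb quoted above.
if __name__ == "__main__":
    v = 100_000
    for deg in (v // 128, v // 64, v // 32):
        bitmap, lists = adjacency_memory_bytes(v, deg)
        print(f"avg degree {deg}: bitmap {bitmap / 1e9:.2f} GB, "
              f"lists {lists / 1e9:.2f} GB")
```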
To build the tests, configure in debug mode:

```
cmake -DCMAKE_BUILD_TYPE=Debug -Bbuild -H.
```

- `cpu-test`: set `AVX=1` and `BIT_ADJ=1` as for `cpu-main`.
- `gpu-test`: modify `gpu/CMakeLists.txt` as for `gpu-main`.

```
cmake --build build --target cpu-test
cmake --build build --target gpu-test
```
Existing test data is under `test_inputs/`.

[optional] Generate an example dataset using the Python script:

```
python3 generate_dateset.py --n-samples=20000 --cluster-std=3.0
```

[optional] Cluster the input and visualize it with scikit-learn:

```
python3 dbscan.py --input-name=test_input.txt --eps=0.1 --min-pts=12
```
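For reference, a rough sketch of what the two helper scripts presumably wrap: `sklearn.datasets.make_blobs` for generation and `sklearn.cluster.DBSCAN` for the baseline clustering. Both are assumptions (as is the rescaling to the unit square, which the sub-1.0 eps values in this README suggest); consult the scripts for their actual behaviour.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for generate_dateset.py: sample Gaussian blobs and
# rescale them to the unit square. centers=4 is illustrative only.
points, _ = make_blobs(n_samples=20000, centers=4, cluster_std=3.0,
                       random_state=0)
points = MinMaxScaler().fit_transform(points)

# Hypothetical stand-in for dbscan.py: the scikit-learn baseline clustering.
labels = DBSCAN(eps=0.1, min_samples=12).fit_predict(points)
```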
Run the CPU binary:

```
./build/bin/cpu-main --input=<path_to_input> --eps=<eps> --min-pts=<P>
```

Add `--print` to see the cluster ids, and `--num-threads=K` to speed up the processing with K threads.

Run the GPU binary:

```
./build/bin/gpu-main --input=<path_to_input> --eps=<eps> --min-pts=<P>
```

Add `--print` to see the cluster ids.

The query parameters below are selected so that roughly 10% of the points end up labelled as noise.
| Input file | Generated with | Clusters | Query with |
|---|---|---|---|
| `test_input_20k.txt` | `--cluster-std=2.2 --n-samples=20000` | 4 | `--eps=0.15 --min-pts=180` |
| `test_input_50k.txt` | `--cluster-std=2.2 --n-samples=50000` | 20 | `--eps=0.09 --min-pts=110` |
| `test_input_100k.txt` | `--cluster-std=2.2 --n-samples=100000` | 20 | `--eps=0.07 --min-pts=100` |
| `test_input_200k.txt` | `--cluster-std=2.2 --n-samples=200000` | 20 | `--eps=0.05 --min-pts=100` |
| `test_input_300k.txt` | `--cluster-std=2.2 --n-samples=300000` | 20 | `--eps=0.044 --min-pts=100` |
| `test_input_400k.txt` | `--cluster-std=2.2 --n-samples=400000` | 20 | `--eps=0.04 --min-pts=105` |
| `test_input_500k.txt` | `--cluster-std=2.2 --n-samples=500000` | 20 | `--eps=0.035 --min-pts=115` |
| `test_input_600k.txt` | `--cluster-std=2.2 --n-samples=600000` | 20 | `--eps=0.035 --min-pts=120` |
| `test_input_700k.txt` | `--cluster-std=2.2 --n-samples=700000` | 20 | `--eps=0.033 --min-pts=122` |
| `test_input_800k.txt` | `--cluster-std=2.2 --n-samples=800000` | 20 | `--eps=0.03 --min-pts=120` |
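To sanity-check a run against the roughly-10%-noise target and the listed cluster counts, a small helper along the following lines can be applied to whatever label vector a run produces. It is hypothetical code, not part of this repository, and it assumes the scikit-learn convention of labelling noise as -1.

```python
import numpy as np

def summarize(labels):
    """Return (number of clusters, noise fraction) for a DBSCAN labelling,
    assuming noise points are labelled -1."""
    labels = np.asarray(labels)
    n_clusters = len(set(labels.tolist()) - {-1})
    noise_fraction = float(np.mean(labels == -1))
    return n_clusters, noise_fraction

# Toy example: clusters 0 and 1, two noise points out of ten.
print(summarize([0, 0, 1, 1, -1, -1, 0, 1, 1, 0]))   # -> (2, 0.2)
```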