Open tmd78 opened 1 year ago
Did you get it working? I'm late seeing this.
Hi! I haven't gotten it to work yet. I'm inspecting what happens at each step of the process to see if I can find the problem.
I'm working with a small dataset:
1,1
2,2
3,3
10,10
11,11
12,12
20,20
21,21
22,22
The algorithm creates an index with 13 bins. Here is the information each bin holds:
bin 1 level: 0 upperBounds: 8.000000
bin 2 level: 0 upperBounds: 15.000000
bin 3 level: 0 upperBounds: 29.000000
bin 4 level: 1 upperBounds: 8.000000
bin 5 level: 1 upperBounds: 15.000000
bin 6 level: 1 upperBounds: 29.000000
bin 7 level: 1 upperBounds: 8.000000
bin 8 level: 1 upperBounds: 15.000000
bin 9 level: 1 upperBounds: 29.000000
bin 10 level: 1 upperBounds: 8.000000
bin 11 level: 1 upperBounds: 15.000000
bin 12 level: 1 upperBounds: 29.000000
The algorithm assigns the points to bins like this:
dataKey: 4 dataValue: 0
dataKey: 4 dataValue: 1
dataKey: 4 dataValue: 2
dataKey: 8 dataValue: 3
dataKey: 8 dataValue: 4
dataKey: 8 dataValue: 5
dataKey: 12 dataValue: 6
dataKey: 12 dataValue: 7
dataKey: 16 dataValue: 8
Does this look correct?
Oh, here's the common.h I got the above results with: common.h.txt
Would you be able to provide me your email address?
I quickly checked it. It seems the issue is in the indexing. There haven't been any distance calculations.
dataKey should be within the range of 4-13. I didn't test the extreme case (max, max), so I need to adjust it.
Also, my email is in the research paper.
I was able to get the points assigned to bins correctly. The next problem I see is the bins' dataBegin and dataEnd properties all being set to zero. Also, the dataKey values are never used by DBSCAN, so we never use the mapping of points to bins?
I see. I fixed the problem. It was a memory issue.
dataKey is not used in DBSCAN because the data are mapped through the indexBuckets ranges, dataBegin and dataEnd. dataBegin and dataEnd are indexes into the dataset.
Hi Madhav. I'm getting this cluster output for the data I provided above:
-2
7
13
7
7
7
13
13
13
These are my common.h settings:
#define RANGE 2
#define UNPROCESSED -1
#define NOISE -2
#define DIMENSION 2
#define TREE_LEVELS (DIMENSION + 1)
#define THREAD_BLOCKS 3
#define THREAD_COUNT 9
#define MAX_SEEDS 128
#define EXTRA_COLLISION_SIZE 512
#define DATASET_COUNT 9
#define MINPTS 3
#define EPS 1.5
#define PARTITION_SIZE 3
#define POINTS_SEARCHED 3
The cluster numbers don't make any sense. I'm trying to figure out how to get correct results. Do you think the issue is my common.h settings, or are there more bugs that need to be addressed?
Note: I'm running this on CUDA 11.2.
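For anyone wanting a baseline to compare the GPU output against, here is a minimal brute-force DBSCAN sketch that runs on the host. It is not the repository's algorithm. The function name referenceDbscan and the local label constants are made up for this example, and it assumes a point counts as its own neighbor (the convention sklearn documents for min_samples) and that distances up to and including EPS qualify.

```cpp
#include <cassert>
#include <cmath>
#include <queue>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;

// Brute-force O(n^2) reference DBSCAN for 2D points.
// Returns one label per point: -2 for noise, otherwise a cluster id from 0.
std::vector<int> referenceDbscan(const std::vector<Point>& pts, double eps, int minPts) {
    assert(eps >= 0.0 && minPts > 0);
    const int kUnprocessed = -1;  // mirrors UNPROCESSED in common.h
    const int kNoise = -2;        // mirrors NOISE in common.h
    const int n = static_cast<int>(pts.size());
    std::vector<int> label(n, kUnprocessed);

    // All points within eps of point i, including i itself (assumption).
    auto neighbors = [&](int i) {
        std::vector<int> out;
        for (int j = 0; j < n; ++j) {
            const double dx = pts[i].first - pts[j].first;
            const double dy = pts[i].second - pts[j].second;
            if (std::sqrt(dx * dx + dy * dy) <= eps) out.push_back(j);
        }
        return out;
    };

    int cluster = 0;
    for (int i = 0; i < n; ++i) {
        if (label[i] != kUnprocessed) continue;
        const std::vector<int> seeds = neighbors(i);
        if (static_cast<int>(seeds.size()) < minPts) {
            label[i] = kNoise;  // may be reclaimed later as a border point
            continue;
        }
        label[i] = cluster;
        std::queue<int> frontier;
        for (int s : seeds) frontier.push(s);
        while (!frontier.empty()) {
            const int q = frontier.front();
            frontier.pop();
            if (label[q] == kNoise) label[q] = cluster;  // border point
            if (label[q] != kUnprocessed) continue;
            label[q] = cluster;
            const std::vector<int> qn = neighbors(q);
            // Only core points expand the cluster further.
            if (static_cast<int>(qn.size()) >= minPts) {
                for (int x : qn) frontier.push(x);
            }
        }
        ++cluster;
    }
    return label;
}
```

Under those assumptions, calling referenceDbscan on the nine points above with eps 1.5 and minPts 3 puts each group of three points in its own cluster (ids 0, 1, 2) and leaves no noise. The GPU cluster ids need not match these numbers, but the grouping should.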
The common file seems good, except that POINTS_SEARCHED should be 9 for 2D and 27 for 3D, as it is the number of cells searched around a cell during the range search.
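The 9 and 27 are 3 raised to the number of dimensions: the cell itself plus its immediate neighbor on each side in every dimension. A small host-side sketch of that count follows. The helper names cellsSearched and neighborOffsets are hypothetical and not part of the repository, and the exact set of cells the kernel visits is an assumption based on the description above.

```cpp
#include <cassert>
#include <vector>

// Number of cells in the 3^d block around a cell: the cell itself plus
// its immediate neighbors in every dimension.
int cellsSearched(int dimension) {
    assert(dimension > 0);
    int count = 1;
    for (int d = 0; d < dimension; ++d) count *= 3;
    return count;
}

// Enumerate the per-dimension offsets (-1, 0, +1) of those cells.
std::vector<std::vector<int>> neighborOffsets(int dimension) {
    std::vector<std::vector<int>> offsets;
    const int total = cellsSearched(dimension);
    for (int code = 0; code < total; ++code) {
        std::vector<int> offset(dimension);
        int rest = code;
        for (int d = 0; d < dimension; ++d) {
            offset[d] = rest % 3 - 1;  // base-3 digit mapped to -1, 0, +1
            rest /= 3;
        }
        offsets.push_back(offset);
    }
    return offsets;
}
```

With DIMENSION 2, cellsSearched returns 9, so a POINTS_SEARCHED of 3 would cover only a third of the surrounding cells.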
The experiment was executed on CUDA 11.3.
The thrust library's equal_range function seems to return 0 even when the parameters are correct. It might be because of a thrust upgrade. I'm not sure which thrust version I used. Nowadays it seems to come with the CUDA installation; I installed it manually.
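One way to check whether the zeros come from thrust or from the inputs is to compute the same ranges on the host with std::equal_range and compare. The sketch below assumes the bin keys are sorted ascending with one key per point, and that dataBegin/dataEnd correspond to a half-open [begin, end) range. The helper name binRange is made up, and whether the repository stores dataEnd as inclusive or exclusive is not confirmed here.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Given bin keys sorted ascending (one key per point), return the half-open
// range [begin, end) of positions whose key equals `bin`.
// An empty range (begin == end) means no point falls in that bin.
std::pair<int, int> binRange(const std::vector<int>& sortedKeys, int bin) {
    assert(std::is_sorted(sortedKeys.begin(), sortedKeys.end()));
    const auto range = std::equal_range(sortedKeys.begin(), sortedKeys.end(), bin);
    const int begin = static_cast<int>(range.first - sortedKeys.begin());
    const int end = static_cast<int>(range.second - sortedKeys.begin());
    return {begin, end};
}
```

For keys 4,4,4,8,8,8,12,12,12, bin 8 maps to positions [3, 6). If the host-side ranges are non-zero while the device-side ones are all zero, the inputs are probably fine and the thrust call (or its version) is the more likely suspect.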
Hello. I'm conducting research under Dr. Gowanlock and I'm working with fast-cuda-gpu-dbscan. I'm having trouble getting the application to find clusters.
My scenario:
Are there any requirements I'm missing? Any help is greatly appreciated.
Note: I've confirmed the expected number of clusters using sklearn's dbscan.
common.h
output.txt