NP-Omix / BioCompass


Nonsense epsilon #53

Open castelao opened 7 years ago

castelao commented 7 years ago

It doesn't make sense to scale epsilon (`eps`) in DBSCAN by the size of the gene comparison matrix (A). It should instead be related to the scores inside that matrix.

In subcluster_gen.py:

...
# eps is swept over 1 .. len(A)-1, so it is bounded by the matrix size, not by the scores in A
for itn in range(1, len(A)):
    db = DBSCAN(eps=itn, min_samples=2).fit_predict(A)
...

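It was a coincidence that most cases so far compared 10 genes at a time while the empirical score range was also close to 10. The results so far are therefore still valid, but the approach does not generalize to other cases, e.g. a comparison of 50 genes would sweep eps up to 49 even if the scores stay in the same range.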
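
One possible direction, as a rough sketch rather than a proposed patch: take the eps candidates from the distribution of pairwise distances between the rows of A, so that the sweep follows the score scale instead of `len(A)`. The helper names `eps_candidates` and `sweep_dbscan` and the choice of quantiles are hypothetical, not something in subcluster_gen.py. The sketch assumes A is a 2-D numeric array and that DBSCAN keeps its default Euclidean metric.

```python
import numpy as np


def eps_candidates(A, quantiles=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Hypothetical helper: candidate eps values taken from the
    pairwise Euclidean distances between the rows of A."""
    A = np.asarray(A, dtype=float)
    # Distance between every pair of rows (Euclidean, DBSCAN's default metric).
    diff = A[:, None, :] - A[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep each pair once (upper triangle) and ignore zero distances.
    pair_dist = dist[np.triu_indices(len(A), k=1)]
    positive = pair_dist[pair_dist > 0]
    if positive.size == 0:
        return np.array([])
    # Quantiles of the observed distances; duplicates are collapsed.
    return np.unique(np.quantile(positive, quantiles))


def sweep_dbscan(A, eps_values, min_samples=2):
    """Hypothetical helper: run DBSCAN once per eps, return labels per eps."""
    # Imported here so eps_candidates can be used without scikit-learn.
    from sklearn.cluster import DBSCAN

    return {
        float(eps): DBSCAN(eps=eps, min_samples=min_samples).fit_predict(A)
        for eps in eps_values
    }
```

With this, the loop would become something like `sweep_dbscan(A, eps_candidates(A))`. The number of eps values tried is then fixed by the quantile list, and their magnitude scales with the scores: multiplying A by a constant multiplies every candidate by the same constant. Whether quantiles are the right summary of the scores is an open question.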