alitouka / spark_dbscan

DBSCAN clustering algorithm on top of Apache Spark
Apache License 2.0

DBSCAN Streaming #9

Open eintopf opened 9 years ago

eintopf commented 9 years ago

Hi,

I'm currently looking for a way to use DBSCAN in a streaming environment (similar to streaming k-means). I found a "workaround" that gets it running with foreachRDD etc., but I don't think it will work in a fully distributed environment. It also seems to be very slow, because the model is rebuilt on every new batch.
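
To illustrate the cost of the per-batch approach, here is a minimal sketch in plain Python without Spark. It is not the spark_dbscan API and not the actual workaround code: `dbscan` is a naive O(n^2) implementation written only for this example, and `recluster_per_batch` is a hypothetical helper that assumes the model is refit on all points seen so far each time a batch arrives.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Naive O(n^2) DBSCAN; returns one label per point (NOISE = -1)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points))
                     if dist(points[i], points[j]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(len(points))
                           if dist(points[j], points[k]) <= eps]
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


def recluster_per_batch(batches, eps, min_pts):
    """Hypothetical helper: refit from scratch on everything seen so far
    after each batch (an assumption about the pattern, not the real code)."""
    seen = []
    history = []
    for batch in batches:
        seen.extend(batch)
        history.append(dbscan(seen, eps, min_pts))
    return history


batches = [
    [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)],
    [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1)],
    [(9.0, 0.0)],
]
history = recluster_per_batch(batches, eps=0.5, min_pts=3)
for labels in history:
    print(len(labels), "points ->", labels)
```

Every batch triggers a full re-clustering of the accumulated data, so the work per batch grows with the total number of points seen rather than with the batch size.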

Are there any plans for a streaming implementation of this? Or any hints where to start?

Best regards, eintopf

AdrianP- commented 8 years ago

+1. In fact, I'm currently working on this. Would anyone like to help?

eintopf commented 8 years ago

@AdrianP- From my side, I could test and give feedback, and maybe take on small tasks. Unfortunately, my overall Spark experience is currently quite limited. Do you have a repo with your progress?