twitter / cassovary

Cassovary is a simple big graph processing library for the JVM
http://twitter.com/cassovary
Apache License 2.0

Implement some similarity algorithms #156

Closed: pankajgupta closed this issue 9 years ago

pankajgupta commented 9 years ago

Such as cosine similarity and Jaccard similarity, for two kinds of query:
(1) Given two nodes N1 and N2, find the similarity between them in a given direction (Out or In).
(2) Given one node N1, find the top K nodes most similar to N1.

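Cosine(N1, N2) = |neighbors(N1) ∩ neighbors(N2)| divided by (sqrt(numNeighbors(N1)) * sqrt(numNeighbors(N2)))

and Jaccard(N1, N2) = |neighbors(N1) ∩ neighbors(N2)| divided by |neighbors(N1) ∪ neighbors(N2)|

Here |S| is the number of nodes in the set S, and neighbors are taken in the chosen direction.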
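
For illustration only, here is a minimal Scala sketch of the two measures and the top-K query, reading the intersection and union in the formulas as set sizes. It does not use Cassovary's graph classes: the `Adjacency` map, the `NeighborSimilarity` object, and every method name below are hypothetical stand-ins, and the map is assumed to already hold each node's neighbors in the chosen direction (Out or In).

```scala
object NeighborSimilarity {
  // Hypothetical stand-in for a graph: node id -> neighbor ids in ONE direction.
  type Adjacency = Map[Int, Set[Int]]

  private def neighbors(g: Adjacency, n: Int): Set[Int] =
    g.getOrElse(n, Set.empty[Int])

  // |N(a) ∩ N(b)| / (sqrt(|N(a)|) * sqrt(|N(b)|)); 0.0 if either set is empty.
  def cosine(g: Adjacency, a: Int, b: Int): Double = {
    val na = neighbors(g, a)
    val nb = neighbors(g, b)
    if (na.isEmpty || nb.isEmpty) 0.0
    else (na intersect nb).size.toDouble /
      (math.sqrt(na.size.toDouble) * math.sqrt(nb.size.toDouble))
  }

  // |N(a) ∩ N(b)| / |N(a) ∪ N(b)|; 0.0 if both sets are empty.
  def jaccard(g: Adjacency, a: Int, b: Int): Double = {
    val na = neighbors(g, a)
    val nb = neighbors(g, b)
    val unionSize = (na union nb).size
    if (unionSize == 0) 0.0
    else (na intersect nb).size.toDouble / unionSize
  }

  // Brute-force top K: score every other node in the map, drop zero scores,
  // sort by descending score (ties broken by smaller node id), keep K.
  def topK(g: Adjacency, node: Int, k: Int,
           sim: (Adjacency, Int, Int) => Double): Seq[(Int, Double)] =
    g.keys.toSeq
      .filter(_ != node)
      .map(other => (other, sim(g, node, other)))
      .filter(_._2 > 0.0)
      .sortWith { case ((id1, s1), (id2, s2)) => s1 > s2 || (s1 == s2 && id1 < id2) }
      .take(k)
}
```

With out-neighbors 1 -> {10, 11, 12} and 2 -> {11, 12, 13}, the shared set has two nodes, so the cosine is 2 / (sqrt(3) * sqrt(3)), about 0.667, and the Jaccard value is 2 / 4 = 0.5. The brute-force `topK` scores every node in the map, which is fine for a sketch but would likely need candidate pruning on a large graph.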

Please add the algorithms in com.twitter.cassovary.algorithms

Tooa commented 9 years ago

This could be closed, right? It seems to be already implemented by @AnishShah

pankajgupta commented 9 years ago

That's right. Closing it.