neo4j-contrib / neo4j-graph-algorithms

Efficient Graph Algorithms for Neo4j
https://github.com/neo4j/graph-data-science/
GNU General Public License v3.0
770 stars 195 forks

3.4 sourceids targetids #820

Closed mneedham closed 5 years ago

mneedham commented 5 years ago

This PR adds functionality discussed on the Neo4j community site - https://community.neo4j.com/t/algo-similarity-jaccard-stream-takes-more-than-3-minutes/4586/14

Things still to do:

mneedham commented 5 years ago

A simple benchmark with a dummy compute step shows little difference in the time taken to compute all pairs:

Benchmark                                                     (concurrency)  Mode   Cnt         Score         Error  Units
SimilarityStreamGeneratorBenchmark.allPairs                               1    ss  1000    138249.941 ±   16260.097  ns/op
SimilarityStreamGeneratorBenchmark.allPairs                               2    ss  1000  22361607.769 ± 2862052.577  ns/op
SimilarityStreamGeneratorBenchmark.allPairs                               8    ss  1000  19253922.198 ± 2185645.041  ns/op
SimilarityStreamGeneratorBenchmark.allPairsBlankSourceTarget              1    ss  1000     82898.555 ±    2002.081  ns/op
SimilarityStreamGeneratorBenchmark.allPairsBlankSourceTarget              2    ss  1000  18480648.936 ± 2548120.298  ns/op
SimilarityStreamGeneratorBenchmark.allPairsBlankSourceTarget              8    ss  1000  20026972.757 ± 2267054.050  ns/op

Will also test this out on the recipes dataset.
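
For readers unfamiliar with what the two benchmark variants exercise, here is a minimal sketch of all-pairs enumeration with optional source and target restrictions. It is not the code from this PR: the class `AllPairsSketch`, its methods, and the rule that empty `sourceIds`/`targetIds` arrays mean "no restriction" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the PR itself.
final class AllPairsSketch {

    // Returns the (source, target) index pairs that would be scored for n items.
    // Assumption: an empty sourceIds or targetIds array means "no restriction".
    static List<int[]> pairs(int n, int[] sourceIds, int[] targetIds) {
        List<int[]> out = new ArrayList<>();
        if (sourceIds.length == 0 && targetIds.length == 0) {
            // Unrestricted case: visit each unordered pair exactly once.
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    out.add(new int[]{i, j});
                }
            }
            return out;
        }
        int[] sources = sourceIds.length == 0 ? range(n) : sourceIds;
        int[] targets = targetIds.length == 0 ? range(n) : targetIds;
        // Restricted case: every source against every target, skipping self-pairs.
        for (int s : sources) {
            for (int t : targets) {
                if (s != t) {
                    out.add(new int[]{s, t});
                }
            }
        }
        return out;
    }

    // Builds the index array 0..n-1.
    static int[] range(int n) {
        int[] r = new int[n];
        for (int i = 0; i < n; i++) {
            r[i] = i;
        }
        return r;
    }

    // Stand-in for the similarity function, so that only pair generation is timed.
    static double dummyCompute(int source, int target) {
        return 1.0;
    }
}
```

With four items, the unrestricted case yields 6 pairs, while restricting sources to `{0, 1}` and targets to `{2, 3}` yields 4. A benchmark with a dummy compute step like `dummyCompute` isolates the cost of this enumeration from the cost of the similarity calculation.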

mneedham commented 5 years ago

graph-algorithms-algo-3.4.12.3.zip