socialsensor / graphdb-benchmarks

Performance benchmark between popular graph databases.
Apache License 2.0

Shortest path implemented in TP3, but really slow #18

Open amcp opened 8 years ago

amcp commented 8 years ago

Original shortest path traversal: https://github.com/socialsensor/graphdb-benchmarks/blob/a3cd0700b39c33364075a089889d648b7de5f7fd/src/main/java/eu/socialsensor/query/TitanQuery.java#L87

Original list of randomly chosen target nodes (range 2-1000, choose 100): https://github.com/socialsensor/graphdb-benchmarks/blob/a3cd0700b39c33364075a089889d648b7de5f7fd/src/main/java/eu/socialsensor/benchmarks/FindShortestPathBenchmark.java#L130

TP3 shortest path traversal: https://github.com/amcp/graphdb-benchmarks/blob/tp3/src/main/java/eu/socialsensor/graphdatabases/TitanGraphDatabase.java#L281

New method to choose source node: https://github.com/amcp/graphdb-benchmarks/blob/tp3/src/main/java/eu/socialsensor/graphdatabases/GraphDatabaseBase.java#L128

New method to choose target nodes (choose from entire range of vertices in dataset): https://github.com/amcp/graphdb-benchmarks/blob/tp3/src/main/java/eu/socialsensor/dataset/Dataset.java#L28

New method to choose number of random target nodes (set the number in configuration file): https://github.com/amcp/graphdb-benchmarks/blob/tp3/src/test/resources/META-INF/input.properties#L86
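To make the target selection described above concrete, here is a minimal sketch of drawing a fixed number of distinct target node ids from an id range, as in the original setup (range 2-1000, choose 100). The class name `TargetNodeSampler`, its `sample` method, and the `seed` parameter are hypothetical and are not taken from the linked benchmark code, which may select targets differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of graphdb-benchmarks.
final class TargetNodeSampler {
    private TargetNodeSampler() {}

    /**
     * Returns `count` distinct node ids drawn uniformly at random
     * from the inclusive range [minId, maxId].
     * The seed is an assumption added here so that runs are repeatable.
     */
    static List<Integer> sample(int minId, int maxId, int count, long seed) {
        int rangeSize = maxId - minId + 1;
        if (count < 0 || count > rangeSize) {
            throw new IllegalArgumentException(
                "count must be between 0 and " + rangeSize);
        }
        // Enumerate every candidate id, shuffle, and keep the first `count`.
        List<Integer> ids = new ArrayList<>(rangeSize);
        for (int id = minId; id <= maxId; id++) {
            ids.add(id);
        }
        Collections.shuffle(ids, new Random(seed));
        return new ArrayList<>(ids.subList(0, count));
    }
}
```

For example, `TargetNodeSampler.sample(2, 1000, 100, 42L)` would give 100 distinct ids between 2 and 1000. Fixing the seed keeps the same targets across databases, which makes shortest-path timings easier to compare; choosing from the entire vertex range instead only changes `minId` and `maxId`.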

amcp commented 8 years ago

I am asking for help from the experts on the forum... https://groups.google.com/forum/#!topic/gremlin-users/rhIhEY9R4E0

amcp commented 8 years ago

I made progress and have fixed the traversal, but the QW-FS workload results are not consistent with the findings in your paper. Can you check out my branch and confirm, please?

sarovios commented 8 years ago

I'll do it ASAP.

amcp commented 8 years ago

I've fixed the performance issue with QW-FS, but the results still do not line up with the paper. There is also a bug I need to track down with the natural data sets: they hang on the first shortest path query, both for Enron and for Amazon.

For the time being, please take a look at the synthetic data set QW-FS results and let me know if you think they are reasonable.

Finally, I still cannot get the Neo4j QW-FS workload to run; it crashes, so that is still up for grabs. https://github.com/amcp/graphdb-benchmarks#miw--qw-results

sarovios commented 8 years ago

I am trying to run the benchmark, but I am getting this exception: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain. I am not very familiar with AWS cloud services or DynamoDB, so can you please provide instructions on how I can run it?

amcp commented 8 years ago

Sotiri, for the purposes of this test I think you can comment out eu.socialsensor.databases=tddb in input.properties; we are working with a graph (Enron) that fits in memory, so I think tbdb would be enough for Titan in this case.