kuzeko / graph-databases-testsuite

Docker Images, installation scripts, and testing & benchmarking suite for Graph Databases
https://graphbenchmark.com
MIT License

V2: NN* queries and sampling #32

Closed lucassardois closed 3 years ago

lucassardois commented 3 years ago

Hello,

While browsing the repo source code for the V2 benchmark, I landed here: https://github.com/kuzeko/graph-databases-testsuite/blob/133116cbcc1c61cf088441e1b5907c4fbd4531f1/SHELLS/common/src/main/java/com/graphbenchmark/queries/vldb/NNincoming.java#L23

How does this work with the generated sample file? For instance, the benchmark generated the following sample file for the air-routes.json dataset:

{
    "nodes": [
      3545,
      3542,
      3612
    ],
    "node_labels": [
      "version",
      "continent"
    ],
    "node_props": [
      {
        "label": "version",
        "name": "author",
        "type": "java.lang.String"
      },
      {
        "label": "version",
        "name": "code",
        "type": "java.lang.String"
      },
      {
        "label": "version",
        "name": "type",
        "type": "java.lang.String"
      }
    ],
    "edges": [
      {
        "source": 3735,
        "target": 98,
        "label": "contains"
      },
      {
        "source": 3737,
        "target": 159,
        "label": "contains"
      },
      {
        "source": 3737,
        "target": 99,
        "label": "contains"
      }
    ],
    "edge_labels": [
      "contains",
      "route"
    ],
    "edge_props": [
      {
        "label": "route",
        "name": "dist",
        "type": "java.lang.Integer"
      }
    ],
    "paths": [
      {
        "source_id": 3735,
        "target_id": 5,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      },
      {
        "source_id": 3735,
        "target_id": 6,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      },
      {
        "source_id": 3735,
        "target_id": 7,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      },
      {
        "source_id": 3735,
        "target_id": 8,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      },
      {
        "source_id": 3735,
        "target_id": 9,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      },
      {
        "source_id": 3735,
        "target_id": 10,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      },
      {
        "source_id": 3735,
        "target_id": 11,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      },
      {
        "source_id": 3735,
        "target_id": 12,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      },
      {
        "source_id": 3735,
        "target_id": 13,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      },
      {
        "source_id": 3735,
        "target_id": 14,
        "source_lbl": "continent",
        "target_lbl": "airport",
        "sequence": [
          "contains",
          "route",
          "route",
          "route",
          "route"
        ]
      }
    ],
    "max_uid": 3741,
    "sample_id": "617cdf6c-5abd-4cb8-86be-5bdf5628ec01"
}

Yet when the query is executed, the benchmark logs show that the node ids used for those queries are not the ones defined in the sample file:

052dcde0-2407-4412-ac21-1794c270b960;291;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 0}];OK;--;29ms;3559 has 7 outgoing edges;{"node":0}
052dcde0-2407-4412-ac21-1794c270b960;291;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 0}];OK;TOTAL;2836ms;--;[{"node": 0}]
052dcde0-2407-4412-ac21-1794c270b960;292;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 1}];OK;--;31ms;3546 has 2 outgoing edges;{"node":1}
052dcde0-2407-4412-ac21-1794c270b960;292;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 1}];OK;TOTAL;3084ms;--;[{"node": 1}]
052dcde0-2407-4412-ac21-1794c270b960;293;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 2}];OK;--;29ms;3555 has 11 outgoing edges;{"node":2}
052dcde0-2407-4412-ac21-1794c270b960;293;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 2}];OK;TOTAL;2970ms;--;[{"node": 2}]
052dcde0-2407-4412-ac21-1794c270b960;294;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 0}];OK;--;29ms;3559 has 7 outgoing edges;{"node":0}
052dcde0-2407-4412-ac21-1794c270b960;294;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 0}];OK;TOTAL;3054ms;--;[{"node": 0}]
052dcde0-2407-4412-ac21-1794c270b960;295;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 1}];OK;--;31ms;3546 has 2 outgoing edges;{"node":1}
052dcde0-2407-4412-ac21-1794c270b960;295;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 1}];OK;TOTAL;2920ms;--;[{"node": 1}]
052dcde0-2407-4412-ac21-1794c270b960;296;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 2}];OK;--;30ms;3555 has 11 outgoing edges;{"node":2}
052dcde0-2407-4412-ac21-1794c270b960;296;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 2}];OK;TOTAL;2926ms;--;[{"node": 2}]
052dcde0-2407-4412-ac21-1794c270b960;297;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 0}];OK;--;30ms;3559 has 7 outgoing edges;{"node":0}
052dcde0-2407-4412-ac21-1794c270b960;297;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 0}];OK;TOTAL;3073ms;--;[{"node": 0}]
052dcde0-2407-4412-ac21-1794c270b960;298;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 1}];OK;--;29ms;3546 has 2 outgoing edges;{"node":1}
052dcde0-2407-4412-ac21-1794c270b960;298;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 1}];OK;TOTAL;3053ms;--;[{"node": 1}]
052dcde0-2407-4412-ac21-1794c270b960;299;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 2}];OK;--;31ms;3555 has 11 outgoing edges;{"node":2}
052dcde0-2407-4412-ac21-1794c270b960;299;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 2}];OK;TOTAL;2965ms;--;[{"node": 2}]

We can see that the node id used to run the query (3559 for node 0) is not the same as the one in the sample file (3545 for node 0).

Can you please explain how the benchmark picks nodes for this query?

lucassardois commented 3 years ago

I was using different sample files...
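
For anyone who runs into the same confusion, a mismatch like the one above can be spotted with a short script before digging into the query code. This is only a minimal sketch and not part of the test suite. It assumes that the `{"node": i}` parameter is an index into the sample file's `nodes` array and that the id printed in the result message is directly comparable to those entries; neither is confirmed in this thread. The field positions are inferred from the log lines quoted above, and the function names `logged_nodes` and `mismatches` are hypothetical.

```python
import json
import re

# Field positions inferred from the log lines quoted in this thread,
# not taken from the benchmark's documentation.
PARAMS_FIELD = 10   # e.g. [{"node": 0}]
RESULT_FIELD = 14   # e.g. "3559 has 7 outgoing edges"


def logged_nodes(log_lines):
    """Map each sample index to the node id reported in the result message."""
    seen = {}
    for line in log_lines:
        fields = line.strip().split(";")
        if len(fields) <= RESULT_FIELD:
            continue
        match = re.match(r"(\d+) has \d+ \w+ edges", fields[RESULT_FIELD])
        if match is None:
            continue  # TOTAL lines carry "--" instead of a result message
        index = json.loads(fields[PARAMS_FIELD])[0]["node"]
        seen[index] = int(match.group(1))
    return seen


def mismatches(sample, log_lines):
    """Return (index, id_in_sample, id_in_log) for every index that disagrees."""
    nodes = sample["nodes"]
    found = []
    for index, logged in sorted(logged_nodes(log_lines).items()):
        expected = nodes[index] if index < len(nodes) else None
        if expected != logged:
            found.append((index, expected, logged))
    return found


if __name__ == "__main__":
    # "nodes" array from the sample file quoted in the question.
    sample = {"nodes": [3545, 3542, 3612]}
    # Two log lines copied verbatim from the question.
    lines = [
        '052dcde0-2407-4412-ac21-1794c270b960;291;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 0}];OK;--;29ms;3559 has 7 outgoing edges;{"node":0}',
        '052dcde0-2407-4412-ac21-1794c270b960;291;Neo4jShell;air-routes.json;dbf21247-0b6b-4123-aedd-a67968a9ba8d;;queries.vldb.NNoutgoing;vldb19-7-dirty;1800;SINGLE_SHOT;[{"node": 0}];OK;TOTAL;2836ms;--;[{"node": 0}]',
    ]
    for index, expected, logged in mismatches(sample, lines):
        print(f"node {index}: sample file has {expected}, log reports {logged}")
```

If every index disagrees, as it does with the data in the question, the most likely explanation under these assumptions is the one given above: the sample file being inspected is not the one the run actually used.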