kuzeko / graph-databases-testsuite

Docker Images, installation scripts, and testing & benchmarking suite for Graph Databases
https://graphbenchmark.com
MIT License

V2: ArangoDB & Janusgraph : IllegalArgumentException: Vertex Label with given name does not exist: person #27

Open lucassardois opened 2 years ago

lucassardois commented 2 years ago

Hello,

When working on the air-routes.json dataset, both ArangoDB and JanusGraph crash on the query NNoutgoingUniqueLabel. Logs for ArangoDB:

...
15:09:26| INFO     - Current query is com.graphbenchmark.queries.vldbj.NodePathLabelSearch                                                                                                                                                   
15:09:26| INFO     - Current mode is SINGLE_SHOT                                                                                                                                                                                             
15:09:26| DEBUG    - []                                                                                                                                                                                                                      
15:09:26| INFO     - Current mode is SINGLE_SHOT                                                                                                                                                                                             
15:09:26| DEBUG    - []                                                                                                                                                                                                                      
15:09:26| INFO     - Current mode is SINGLE_SHOT                                                                                                                                                                                             
15:09:26| DEBUG    - []                                                                                                                                                                                                                      
15:09:26| INFO     - Current query is com.graphbenchmark.queries.MixCountInsertSchema                                                                                                                                                        
15:09:26| INFO     - Current mode is SINGLE_SHOT                                                                                                                                                                                             
15:09:26| DEBUG    - [{}]                                                                                                                                                                                                                    
15:09:30| ERROR    - Error in query. Code: 1                                                                                                                                                                                                 
15:09:30| CRITICAL - [ARANGODB-INIT] starting over with existing database
[ARANGODB-INIT] Will try to enable NUMA interleave on all nodes...
[ARANGODB-INIT] error: cannot use numactl, if you are sure that it is supported by your hardware and system, please check that container is running with --security-opt seccomp=unconfined                                                   
[ARANGODB-INIT] Arango up in 2 seconds
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
java.lang.IllegalArgumentException: Vertex label (person) not in graph (MAIN_GRAPH) vertex collections.
        at com.arangodb.tinkerpop.gremlin.structure.ArangoDBGraph.addVertex(ArangoDBGraph.java:628)
        at com.graphbenchmark.queries.MixCountInsertSchema.lambda$query$0(MixCountInsertSchema.java:57)
        at com.graphbenchmark.common.RetryTransactionCtx.retry(RetryTransactionCtx.java:31)
        at com.graphbenchmark.queries.MixCountInsertSchema.query(MixCountInsertSchema.java:44)
        at com.graphbenchmark.queries.MixCountInsertSchema.query(MixCountInsertSchema.java:20)
        at com.graphbenchmark.common.GenericQuery._execute(GenericQuery.java:41)
        at com.graphbenchmark.common.GenericQuery.execute(GenericQuery.java:31)
        at com.graphbenchmark.common.GenericShell.lambda$exec_one_shot$2(GenericShell.java:204)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:830)

15:09:30| CRITICAL - Irrecoverable error

Logs for JanusGraph:

...
10:05:54| INFO     - Current query is com.graphbenchmark.queries.MixCountInsertSchema                                                                                                                                                         
10:05:54| INFO     - Current mode is SINGLE_SHOT                                                                                                                                                                                              
10:05:54| DEBUG    - [{}]                                                                                                                                                                                                                     
10:06:15| ERROR    - Error in query. Code: 1 
java.lang.IllegalArgumentException: Vertex Label with given name does not exist: person
        at org.janusgraph.graphdb.types.typemaker.DisableDefaultSchemaMaker.makeVertexLabel(DisableDefaultSchemaMaker.java:52)                                                                                                               
        at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.getOrCreateVertexLabel(StandardJanusGraphTx.java:1068)                                                                                                                    
        at org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsTransaction.addVertex(JanusGraphBlueprintsTransaction.java:117)                                                                                                              
        at org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph.addVertex(JanusGraphBlueprintsGraph.java:141)
        at org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph.addVertex(JanusGraphBlueprintsGraph.java:59)
        at com.graphbenchmark.queries.MixCountInsertSchema.lambda$query$0(MixCountInsertSchema.java:57)
        at com.graphbenchmark.common.RetryTransactionCtx.retry(RetryTransactionCtx.java:31)
        at com.graphbenchmark.queries.MixCountInsertSchema.query(MixCountInsertSchema.java:44)
        at com.graphbenchmark.queries.MixCountInsertSchema.query(MixCountInsertSchema.java:20)
        at com.graphbenchmark.common.GenericQuery._execute(GenericQuery.java:41)
        at com.graphbenchmark.common.GenericQuery.execute(GenericQuery.java:31)
        at com.graphbenchmark.common.GenericShell.lambda$exec_one_shot$2(GenericShell.java:204)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

Both errors seem to point to the same issue. The queries try to find a vertex with the label person, which does not exist in the air-routes.json dataset (it exists only in the tinkerpop_modern.json dataset).

The same queries do not crash on the other database systems.

MartinBrugnara commented 2 years ago

No, they are actually both crashing on MixCountInsertSchema.

Please avoid the Mix* queries; they are not ready yet. We have experienced many issues with several systems, especially in the concurrent scenario.

Settings and sampling are probably wrong.

I wouldn't expect these lines in the log:

15:09:26| INFO     - Current query is com.graphbenchmark.queries.vldbj.NodePathLabelSearch                                                                                                                                                   
15:09:26| INFO     - Current mode is SINGLE_SHOT                                                                                                                                                                                             
15:09:26| DEBUG    - [] 

The array in the DEBUG line should contain values. Could you please share your config and sample files?
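A quick way to spot this in a longer log is to scan for queries whose parameter DEBUG line is an empty array. The following is only a sketch: the line pattern is inferred from the log excerpts in this thread, and the helper name `queries_with_empty_params` is made up for illustration; it is not part of the test suite.

```python
import re

# Pattern inferred from the excerpts above: "HH:MM:SS| LEVEL    - message".
LOG_LINE = re.compile(r"^\d{2}:\d{2}:\d{2}\|\s+(\w+)\s+-\s(.*)$")
QUERY_PREFIX = "Current query is "


def queries_with_empty_params(log_text):
    """Return query class names that are followed by a 'DEBUG - []' line."""
    flagged = []
    current_query = None
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.rstrip())
        if not match:
            continue  # stack traces and container output are skipped
        level, message = match.group(1), match.group(2).strip()
        if level == "INFO" and message.startswith(QUERY_PREFIX):
            current_query = message[len(QUERY_PREFIX):]
        elif level == "DEBUG" and message == "[]" and current_query:
            if current_query not in flagged:
                flagged.append(current_query)
    return flagged


excerpt = """\
15:09:26| INFO     - Current query is com.graphbenchmark.queries.vldbj.NodePathLabelSearch
15:09:26| INFO     - Current mode is SINGLE_SHOT
15:09:26| DEBUG    - []
15:09:26| INFO     - Current query is com.graphbenchmark.queries.MixCountInsertSchema
15:09:26| INFO     - Current mode is SINGLE_SHOT
15:09:26| DEBUG    - [{}]
"""
flagged = queries_with_empty_params(excerpt)
print(flagged)
```

Only an exactly empty array is flagged; a line such as `[{}]` is treated as containing a (parameterless) entry.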

lucassardois commented 2 years ago

OK, this makes sense. I'm going to remove those queries.
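Removing them by hand from a long list is error-prone, so the filtering could be scripted. A minimal sketch, assuming the query list is available as a plain Python list of fully qualified class names; the helper `drop_mix_queries` is hypothetical and not part of the test suite:

```python
def drop_mix_queries(queries):
    """Keep only queries whose simple class name does not start with 'Mix'."""
    kept = []
    for fqcn in queries:
        simple_name = fqcn.rsplit(".", 1)[-1]  # e.g. "MixCountInsertSchema"
        if not simple_name.startswith("Mix"):
            kept.append(fqcn)
    return kept


# A few entries taken from the config in this thread.
queries = [
    "com.graphbenchmark.queries.vldb.IdSearchNode",
    "com.graphbenchmark.queries.MixCountInsertSchema",
    "com.graphbenchmark.queries.MixCountDelete",
    "com.graphbenchmark.queries.vldbj.NodePathLabelSearch",
]
filtered = drop_mix_queries(queries)
print(filtered)
```

Matching on the simple class name rather than on the substring "Mix" anywhere in the path avoids dropping an unrelated query by accident.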

My config file:

queries = [
    "com.graphbenchmark.queries.vldbj.NodePathLabelSearchOut",
    "com.graphbenchmark.queries.vldbj.NodeLabelledPropertySearch",
    "com.graphbenchmark.queries.vldb.NodePropertySearch",
    "com.graphbenchmark.queries.vldb.KDegreeOut",
    "com.graphbenchmark.queries.vldb.ShortestPathFiltered",
    "com.graphbenchmark.queries.vldbj.NodeKReachability",
    "com.graphbenchmark.queries.vldb.BFS",
    "com.graphbenchmark.queries.vldb.InsertNodeProperty",
    "com.graphbenchmark.queries.vldb.NNincomingUniqueLabel",
    "com.graphbenchmark.queries.vldb.CountNodes",
    "com.graphbenchmark.queries.vldb.InsertEdgeWithProperty",
    "com.graphbenchmark.queries.vldb.NNbothUniqueLabel",
    "com.graphbenchmark.queries.vldb.IdSearchEdge",
    "com.graphbenchmark.queries.vldb.DeleteEdgeProperty",
    "com.graphbenchmark.queries.vldbj.NodePathLabelSearch",
    "com.graphbenchmark.queries.MixCountInsertSchema",
    "com.graphbenchmark.queries.vldb.EdgeLabelSearch",
    "com.graphbenchmark.queries.vldbj.InsertNode",
    "com.graphbenchmark.queries.vldbj.AllNodeLabelPatternSquare",
    "com.graphbenchmark.queries.vldbj.NodeLabelSearch",
    "com.graphbenchmark.queries.vldbj.Noop",
    "com.graphbenchmark.queries.vldb.CountEdges",
    "com.graphbenchmark.queries.vldb.InsertNodeWithProperty",
    "com.graphbenchmark.queries.vldbj.NodeKReachabilityOut",
    "com.graphbenchmark.queries.vldb.KDegreeIn",
    "com.graphbenchmark.queries.vldb.DeleteNode",
    "com.graphbenchmark.queries.vldb.InsertNodeWithEdges",
    "com.graphbenchmark.queries.MixCountDelete",
    "com.graphbenchmark.queries.vldbj.NodeLabelledKReachability",
    "com.graphbenchmark.queries.vldb.UpdateNodeProperty",
    "com.graphbenchmark.queries.vldb.UpdateEdgeProperty",
    "com.graphbenchmark.queries.vldb.BFSfiltered",
    "com.graphbenchmark.queries.vldb.NNincoming",
    "com.graphbenchmark.queries.vldbj.NodePatternTriangle",
    "com.graphbenchmark.queries.vldb.KDegreeBoth",
    "com.graphbenchmark.queries.vldbj.AllEdgeLabelPatternSquare",
    "com.graphbenchmark.queries.vldbj.NodePatternSquare",
    "com.graphbenchmark.queries.vldb.ShortestPath",
    "com.graphbenchmark.queries.vldb.CountNonRoot",
    "com.graphbenchmark.queries.vldb.NodeUIDSearch",
    "com.graphbenchmark.queries.vldb.NNbothFiltered",
    "com.graphbenchmark.queries.vldbj.AllEdgeLabelPatternTriangle",
    "com.graphbenchmark.queries.vldb.IdSearchNode",
    "com.graphbenchmark.queries.vldb.DeleteNodeProperty",
    "com.graphbenchmark.queries.vldbj.NodeLabelledKReachabilityOut",
    "com.graphbenchmark.queries.vldbj.CountUniqueNodeLabels",
    "com.graphbenchmark.queries.MixCountInsertSimple",
    "com.graphbenchmark.queries.vldb.NNoutgoingUniqueLabel",
    "com.graphbenchmark.queries.vldb.NNoutgoing",
    "com.graphbenchmark.queries.vldb.InsertEdgeProperty",
    "com.graphbenchmark.queries.vldb.CountUniqueEdgeLabels",
    "com.graphbenchmark.queries.vldb.InsertEdge",
    "com.graphbenchmark.queries.vldb.EdgePropertySearch",
    "com.graphbenchmark.queries.vldbj.AllNodeLabelPatternTriangle",
]
warmup = []                                                                                                                                                                                                                                   
loader = "com.graphbenchmark.queries.mgm.Load"                                                                                                                                                                                                
sampler = "com.graphbenchmark.queries.mgm.Sampler"                                                                                                                                                                                            
mapsample = "com.graphbenchmark.queries.mgm.MapSample"                                                                                                                                                                                        
sample_id = "dbf21247-0b6b-4123-aedd-a67968a9ba8d"                                                                                                                                                                                            
modes = [ "SINGLE_SHOT",]                                                                                                                                                                                                                     
iterations = 3                                                                                                                                                                                                                                
threads = 3                                                                                                                                                                                                                                   
jvm_opts = "-XX:+UseG1GC -XX:+UnlockExperimentalVMOptions"                                                                                                                                                                                    

[sampling]                                                                                                                                                                                                                                    
nodes = 3                                                                                                                                                                                                                                     
node_labels = 2                                                                                                                                                                                                                               
node_props = 3                                                                                                                                                                                                                                
edges = 3                                                                                                                                                                                                                                     
edge_labels = 2
edge_props = 0
paths = 0
paths_max_len = 5

[timeout]
single_shot = 1800
batch = 10800
concurrent = 3600
load = 86400
extra_container = 600
consecutive = 3

[cnt_opts]
security_opt = [ "seccomp=unconfined",]
mem_limit = 267252325908
mem_swappiness = 1
cap_add = [ "IPC_LOCK",]

[databases.neo4j]
image = "graphbenchmark.com/neo4j:latest"

[databases.orientdb]
image = "graphbenchmark.com/orientdb:latest"
jvm_opts = "-XX:+UseG1GC -XX:MaxDirectMemorySize=512m"

[databases.arangodb]
image = "graphbenchmark.com/arangodb:latest"

[databases.sqlgpg]
image = "graphbenchmark.com/sqlgpg:latest"

# [databases.janusgraph]
# image = "graphbenchmark.com/janusgraph:latest"

# [datasets."tinkerpop-modern_mod.json"]
# path = "/runtime/data/tinkerpop-modern_mod.json"
# uid_field = "uid"

[datasets."air-routes.json"]
path = "/runtime/data/air-routes.json"
uid_field = "uid"

# [datasets."dbpedia.escaped.json"]
# path = "/runtime/data/dbpedia.escaped.json"
# uid_field = "uid"
#
# [datasets."ldbc.scale10.escaped.json"]
# path = "/runtime/data/ldbc.scale10.escaped.json"
# uid_field = "uid"

[cnt_opts.ulimits.memlock]
hard = -1
soft = -1

Sample for tinkerpop-modern:

{"nodes":[5,2,4],"node_labels":["software","person"],"node_props":[{"label":"person","name":"uid","type":"java.lang.Long"},{"label":"software","name":"lang","type":"java.lang.String"},{"label":"person","name":"name","type":"java.lang.String"}],"edges":[{"source":1,"target":4,"label":"knows"},{"source":4,"target":3,"label":"created"},{"source":6,"target":3,"label":"created"}],"edge_labels":["knows","created"],"edge_props":[],"paths":[],"max_uid":6,"sample_id":"dbf21247-0b6b-4123-aedd-a67968a9ba8d"}

Sample for air-routes:

{"nodes":[3559,3546,3555],"node_labels":["continent","airport"],"node_props":[{"label":"airport","name":"code","type":"java.lang.String"},{"label":"airport","name":"elev","type":"java.lang.Integer"},{"label":"airport","name":"type","type":"java.lang.String"}],"edges":[{"source":3737,"target":33,"label":"contains"},{"source":3506,"target":133,"label":"contains"},{"source":3692,"target":293,"label":"contains"}],"edge_labels":["contains","route"],"edge_props":[],"paths":[],"max_uid":3741,"sample_id":"dbf21247-0b6b-4123-aedd-a67968a9ba8d"}
MartinBrugnara commented 2 years ago

Ok, the log makes sense because the sample has no paths.
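To see at a glance which parts of a sample file are empty, the JSON can be inspected directly. This is a rough sketch using a trimmed copy of the air-routes sample posted above (the `node_props` and `edges` entries are left out for brevity); `empty_sample_fields` is a hypothetical helper, not something the suite provides.

```python
import json


def empty_sample_fields(sample_json):
    """Return the names of list-valued sample fields that contain no entries."""
    sample = json.loads(sample_json)
    return sorted(
        name for name, value in sample.items()
        if isinstance(value, list) and not value
    )


# Trimmed copy of the air-routes sample from this thread.
air_routes_sample = """
{"nodes": [3559, 3546, 3555],
 "node_labels": ["continent", "airport"],
 "edge_labels": ["contains", "route"],
 "edge_props": [],
 "paths": [],
 "max_uid": 3741,
 "sample_id": "dbf21247-0b6b-4123-aedd-a67968a9ba8d"}
"""
empty = empty_sample_fields(air_routes_sample)
print(empty)

# The label from the stack traces is not among the sampled node labels.
has_person = "person" in json.loads(air_routes_sample)["node_labels"]
print(has_person)
```

The empty `paths` and `edge_props` fields correspond to `paths = 0` and `edge_props = 0` in the `[sampling]` section of the config.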

Just to ensure everything else is set up correctly, could you please try to execute just queries = ["com.graphbenchmark.queries.vldb.IdSearchNode"] and check that the log actually contains values in the parameters debug line (or just paste the log here)?

lucassardois commented 2 years ago

Here we go:

15:19:01| DEBUG    - Shells are running with -d
15:19:01| DEBUG    - Main benchamark loop
15:19:01| INFO     - Current dataset is tinkerpop-modern_mod.json
15:19:01| INFO     - Overriding java options: -XX:+UseG1GC -XX:MaxDirectMemorySize=512m
15:19:01| INFO     - Current database is orientdb
15:19:01| DEBUG    - Get queries execution configuration list
15:19:03| DEBUG    - Runnig with query version: vldb19-7-dirty
15:19:03| INFO     - Current query is com.graphbenchmark.queries.vldb.IdSearchNode
15:19:03| INFO     - Current mode is SINGLE_SHOT
15:19:03| DEBUG    - [{'node': 0}, {'node': 1}, {'node': 2}]

So this looks fine to me?

MartinBrugnara commented 2 years ago

Yes, indeed. This seems perfect.