JanusGraph / janusgraph

JanusGraph: an open-source, distributed graph database
https://janusgraph.org

Importing data using spark with exception #1055

Open chaipugao opened 6 years ago

chaipugao commented 6 years ago

This is my hadoop-load.properties config file:

gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.kryo
gremlin.hadoop.inputLocation=/user/zhizi/janusgraph/data/grateful-dead.kryo
gremlin.hadoop.outputLocation=/user/zhizi/janusgraph/output
gremlin.hadoop.jarsInDistributedCache=true

giraph.minWorkers=2
giraph.maxWorkers=2
giraph.useOutOfCoreGraph=true
giraph.useOutOfCoreMessages=true
mapred.map.child.java.opts=-Xmx1024m
mapred.reduce.child.java.opts=-Xmx1024m
giraph.numInputThreads=4
giraph.numComputeThreads=4
giraph.maxMessagesInMemory=100000

spark.master=local[*]
spark.master=spark://172.22.1.100:7077
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.eventLog.enabled=true
spark.eventLog.dir=./log

This is the error log:

17:52:54 WARN org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 1.2 (TID 9, s001102.bj1.lwsite.net): java.lang.RuntimeException: GraphFactory could not instantiate this Graph implementation [class org.janusgraph.core.JanusGraphFactory]
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:82)
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:70)
    at org.apache.tinkerpop.gremlin.process.computer.bulkloading.BulkLoaderVertexProgram.workerIterationStart(BulkLoaderVertexProgram.java:171)
    at org.apache.tinkerpop.gremlin.spark.process.computer.SparkExecutor.lambda$executeVertexProgramIteration$35c6b113$1(SparkExecutor.java:103)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$9$1.apply(JavaRDDLike.scala:215)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$9$1.apply(JavaRDDLike.scala:215)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:78)
    ... 23 more
Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.es.ElasticSearchIndex
    at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
    at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:477)
    at org.janusgraph.diskstorage.Backend.getIndexes(Backend.java:464)
    at org.janusgraph.diskstorage.Backend.<init>(Backend.java:149)
    at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.getBackend(GraphDatabaseConfiguration.java:1897)
    at org.janusgraph.graphdb.database.StandardJanusGraph.<init>(StandardJanusGraph.java:136)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:164)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:133)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:113)
    ... 28 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
    ... 36 more
Caused by: java.lang.NoSuchMethodError: org.apache.http.util.Asserts.check(ZLjava/lang/String;Ljava/lang/Object;)V
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.ensureRunning(CloseableHttpAsyncClientBase.java:90)
    at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:123)
    at org.elasticsearch.client.RestClient.performRequestAsync(RestClient.java:343)
    at org.elasticsearch.client.RestClient.performRequestAsync(RestClient.java:325)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:218)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:191)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:153)
    at org.janusgraph.diskstorage.es.rest.RestElasticSearchClient.getMajorVersion(RestElasticSearchClient.java:109)
    at org.janusgraph.diskstorage.es.rest.RestElasticSearchClient.<init>(RestElasticSearchClient.java:92)
    at org.janusgraph.diskstorage.es.ElasticSearchSetup$1.connect(ElasticSearchSetup.java:73)
    at org.janusgraph.diskstorage.es.ElasticSearchIndex.interfaceConfiguration(ElasticSearchIndex.java:291)
    at org.janusgraph.diskstorage.es.ElasticSearchIndex.<init>(ElasticSearchIndex.java:203)
    ... 41 more

How can I fix this exception?

Thanks.

Miroka96 commented 6 years ago

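I had a similar error because the Apache Spark dependencies (2.2.0 on my system) conflict with the JanusGraph dependencies. When the two pull in different versions of the same library, the version that ends up on the classpath can lack a method the other side was compiled against, which is what the NoSuchMethodError for org.apache.http.util.Asserts.check in your log suggests.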

I solved it by shading (relocating) some dependencies with the Maven Shade Plugin.

Have a look at my repository: https://github.com/Miroka96/janusgraph. Recompile it yourself and import it locally into Spark.
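
Before rebuilding anything, it may help to confirm the conflict described above. The following is a minimal diagnostic sketch, not part of JanusGraph or Spark: the class name `ClasspathCheck` and its helper methods are hypothetical. It reports which jar a class was loaded from and whether a given public method exists. The method it looks for, `check(boolean, String, Object)`, is simply the descriptor `(ZLjava/lang/String;Ljava/lang/Object;)V` from the NoSuchMethodError written out as Java types. The result is only meaningful if the program runs with the same classpath as the Spark executors.

```java
import java.security.CodeSource;

// Hypothetical helper for diagnosing dependency conflicts; not a JanusGraph or Spark API.
final class ClasspathCheck {

    /** Returns the jar or directory a class was loaded from, or a marker for JDK classes. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    /** Returns true if the class is on the classpath and has a public method with this signature. */
    public static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            // Load without initializing, so static initializers do not run.
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            cls.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.http.util.Asserts";
        try {
            Class<?> cls = Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            System.out.println(name + " loaded from: " + locationOf(cls));
            // (ZLjava/lang/String;Ljava/lang/Object;)V == void check(boolean, String, Object)
            boolean present = hasMethod(name, "check", boolean.class, String.class, Object.class);
            System.out.println("check(boolean, String, Object) present: " + present);
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on this classpath");
        }
    }
}
```

If the reported location is a jar shipped with Spark rather than with JanusGraph, and the method is reported as absent, that would be consistent with the version conflict and with relocating the affected packages as a fix.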

pluradj commented 6 years ago

@Miroka96 Would you consider contributing a patch to JanusGraph? Seems like your work might satisfy #114.

Miroka96 commented 6 years ago

@pluradj It could take a while for me to create a patch, because I have never done it before and I have too much to do this month (I have to finish my bachelor's degree).

Besides that, my fix only works in the janusgraph-all module, not in the others. A better solution would be to find the relevant parts in the POM files of the other modules. (This could also be a new issue marked as a code improvement.)