Open sandeepdoctily opened 6 years ago
DynamoDBInputFormat is not implemented yet, but could be implemented by copying from or depending on the DynamoDB EMR connector. https://github.com/awslabs/emr-dynamodb-connector/blob/master/emr-dynamodb-hadoop/src/main/java/org/apache/hadoop/dynamodb/read/DynamoDBInputFormat.java
Is DynamoDBInputFormat now implemented?
Hi All
I am trying to configure SparkGraphComputer with DynamoDB Local. My configuration is below. Any help would be appreciated.
# TinkerPop Hadoop Graph for OLAP
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
# Set the default OLAP computer for graph.traversal().withComputer()
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
gremlin.hadoop.graphInputFormat=org.apache.hadoop.dynamodb.read.DynamoDBInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat
####################################
# SparkGraphComputer Configuration
####################################
spark.master=local[*]
spark.executor.memory=200m
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.akka.timeout=500000
spark.kryo.registrationRequired=false
spark.storage.memoryFraction=0.2
spark.eventLog.enabled=true
spark.eventLog.dir=/tmp/spark-event-logs
spark.ui.killEnabled=true
spark.dynamicAllocation.enabled=false
spark.network.timeout=60000
spark.rpc.askTimeout=80000
spark.sql.broadcastTimeout=90000
spark.serializer=org.apache.spark.serializer.KryoSerializer
janusgraphmr.ioformat.conf.storage.backend=com.amazon.janusgraph.diskstorage.dynamodb.DynamoDBStoreManager
janusgraphmr.ioformat.conf.storage.dynamodb.client.credentials.class-name=com.amazonaws.auth.BasicAWSCredentials
janusgraphmr.ioformat.conf.storage.dynamodb.client.credentials.constructor-args=access,secret
janusgraphmr.ioformat.conf.storage.dynamodb.client.signing-region=us-east-1
janusgraphmr.ioformat.conf.storage.dynamodb.client.endpoint=http://localhost:8000
gremlin.graph=org.janusgraph.core.JanusGraphFactory
metrics.enabled=true
metrics.prefix=j
metrics.csv.interval=1000
metrics.csv.directory=metrics
storage.write-time=1 ms
storage.read-time=1 ms
storage.backend=com.amazon.janusgraph.diskstorage.dynamodb.DynamoDBStoreManager
storage.dynamodb.client.credentials.class-name=com.amazonaws.auth.BasicAWSCredentials
storage.dynamodb.client.credentials.constructor-args=access,secret
storage.dynamodb.client.signing-region=us-east-1
storage.dynamodb.client.endpoint=http://localhost:8000
When I run a query, I get the exception below:
gremlin> g.V().count()
java.lang.RuntimeException: class org.apache.hadoop.dynamodb.read.DynamoDBInputFormat not org.apache.hadoop.mapreduce.InputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2221)
    at org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$0(SparkGraphComputer.java:177)
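For anyone landing here with the same trace: the message format "class X not Y" is what Hadoop's Configuration.getClass produces when the configured class is not assignable to the expected type. SparkGraphComputer asks for an org.apache.hadoop.mapreduce.InputFormat. As far as I can tell, the EMR connector's DynamoDBInputFormat is written against the older org.apache.hadoop.mapred API, but treat that as an assumption, not something confirmed in this thread. The sketch below reproduces the check with hypothetical stand-in types (none of them are the real Hadoop classes), so it runs without any dependencies.

```java
// Sketch only: every nested type here is a hypothetical stand-in, not a real Hadoop class.
class InputFormatCheck {
    /** Stand-in for the older org.apache.hadoop.mapred.InputFormat interface. */
    interface MapredInputFormat {}

    /** Stand-in for the newer org.apache.hadoop.mapreduce.InputFormat abstract class. */
    abstract static class MapreduceInputFormat {}

    /** Stand-in for a connector format that only implements the older API (assumption). */
    static class ConnectorInputFormat implements MapredInputFormat {}

    /** A format written against the newer API, which passes the check. */
    static class NewApiInputFormat extends MapreduceInputFormat {}

    /**
     * Mirrors the kind of assignability check that yields "class X not Y":
     * the configured class must be a subtype of the expected type.
     */
    static <U> Class<? extends U> resolve(Class<?> theClass, Class<U> xface) {
        if (!xface.isAssignableFrom(theClass)) {
            throw new RuntimeException(theClass + " not " + xface.getName());
        }
        return theClass.asSubclass(xface);
    }

    public static void main(String[] args) {
        // Accepted: the class extends the expected (newer API) type.
        System.out.println(resolve(NewApiInputFormat.class, MapreduceInputFormat.class).getSimpleName());
        try {
            // Rejected: the class only implements the older API stand-in.
            resolve(ConnectorInputFormat.class, MapreduceInputFormat.class);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If that assumption about the connector holds, setting gremlin.hadoop.graphInputFormat to the connector class cannot work as-is, whatever the other settings are. Something that extends the mapreduce InputFormat and produces vertices would be needed in between, which is essentially what this issue asks for.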