awslabs / amazon-neptune-tools

Tools and utilities to enable loading data and building graph applications with Amazon Neptune.
Apache License 2.0
297 stars 151 forks source link

[export-neptune-to-elasticsearch] Error exporting nodes #68

Closed ctSkennerton closed 4 years ago

ctSkennerton commented 4 years ago

When using the neptune to elasticsearch solution I found that the elasticsearch index appeared to be missing a lot of data. Going back through the logs I see that the export neptune batch job succeeded but contained the following stacktrace


[main] INFO com.amazonaws.services.neptune.propertygraph.RangeFactory - Limit: 319015, Size: 159508

[pool-6-thread-2] INFO com.amazonaws.services.neptune.propertygraph.NodesClient - __.V().range(0L,159508L).project("id","label","properties").by(T.id).by(T.label).by(__.valueMap())

[pool-6-thread-1] INFO com.amazonaws.services.neptune.propertygraph.NodesClient - __.V().range(159508L,319015L).project("id","label","properties").by(T.id).by(T.label).by(__.valueMap())

java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer

    at com.amazonaws.services.neptune.propertygraph.metadata.DataType$5.printTo(DataType.java:81)

    at com.amazonaws.services.neptune.propertygraph.io.NeptuneStreamsJsonPropertyGraphPrinter.printRecord(NeptuneStreamsJsonPropertyGraphPrinter.java:133)

    at com.amazonaws.services.neptune.propertygraph.io.NeptuneStreamsJsonPropertyGraphPrinter.printRecord(NeptuneStreamsJsonPropertyGraphPrinter.java:114)

    at com.amazonaws.services.neptune.propertygraph.io.NeptuneStreamsJsonPropertyGraphPrinter.printProperties(NeptuneStreamsJsonPropertyGraphPrinter.java:71)

    at com.amazonaws.services.neptune.propertygraph.io.NodeWriter.handle(NodeWriter.java:36)

    at com.amazonaws.services.neptune.propertygraph.io.NodeWriter.handle(NodeWriter.java:18)

    at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.handle(ExportPropertyGraphTask.java:91)

    at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask$CountingHandler.handle(ExportPropertyGraphTask.java:132)

    at com.amazonaws.services.neptune.propertygraph.NodesClient.lambda$queryForValues$1(NodesClient.java:89)

    at org.apache.tinkerpop.gremlin.process.traversal.Traversal.forEachRemaining(Traversal.java:272)

    at com.amazonaws.services.neptune.propertygraph.NodesClient.queryForValues(NodesClient.java:87)

    at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.run(ExportPropertyGraphTask.java:71)

    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

    at java.lang.Thread.run(Thread.java:748)

[kpl-daemon-0003] INFO com.amazonaws.services.kinesis.producer.LogInputStreamReader - [2020-08-17 22:49:35.527651] [0x00000022][0x00007f67a76887c0] [info] [kinesis_producer.cc:200] Created pipeline for stream "neptune-export-cdde6d20"

[kpl-daemon-0003] INFO com.amazonaws.services.kinesis.producer.LogInputStreamReader - [2020-08-17 22:49:35.527741] [0x00000022][0x00007f67a76887c0] [info] [shard_map.cc:87] Updating shard map for stream "neptune-export-cdde6d20"

[kpl-daemon-0003] INFO com.amazonaws.services.kinesis.producer.LogInputStreamReader - [2020-08-17 22:49:35.552757] [0x00000022][0x00007f67a1e7b700] [info] [shard_map.cc:148] Successfully updated shard map for stream "neptune-export-cdde6d20" found 8 shards

[gremlin-driver-loop-1] ERROR org.apache.tinkerpop.gremlin.driver.Handler$GremlinResponseHandler - Could not process the response

io.netty.handler.codec.http.websocketx.CorruptedWebSocketFrameException: Max frame length of 65536 has been exceeded.

    at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.protocolViolation(WebSocket08FrameDecoder.java:426)

    at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.decode(WebSocket08FrameDecoder.java:286)

    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)

    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)

    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)

    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475)

    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1224)

    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1271)

    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)

    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)

    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)

    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)

    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)

    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)

    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)

    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)

    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)

    at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)

    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

    at java.lang.Thread.run(Thread.java:748)

java.util.concurrent.CompletionException: io.netty.handler.codec.http.websocketx.CorruptedWebSocketFrameException: Max frame length of 65536 has been exceeded.

    at java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:375)

    at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1947)

    at org.apache.tinkerpop.gremlin.driver.ResultSet.one(ResultSet.java:119)

    at org.apache.tinkerpop.gremlin.driver.ResultSet$1.hasNext(ResultSet.java:171)

    at org.apache.tinkerpop.gremlin.driver.ResultSet$1.next(ResultSet.java:178)

    at org.apache.tinkerpop.gremlin.driver.ResultSet$1.next(ResultSet.java:165)

    at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal$TraverserIterator.next(DriverRemoteTraversal.java:146)

    at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal$TraverserIterator.next(DriverRemoteTraversal.java:131)

    at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal.nextTraverser(DriverRemoteTraversal.java:112)

    at org.apache.tinkerpop.gremlin.process.remote.traversal.step.map.RemoteStep.processNextStart(RemoteStep.java:80)

    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)

    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)

    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:205)

    at org.apache.tinkerpop.gremlin.process.traversal.Traversal.forEachRemaining(Traversal.java:272)

    at com.amazonaws.services.neptune.propertygraph.NodesClient.queryForValues(NodesClient.java:87)

    at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.run(ExportPropertyGraphTask.java:71)

    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

    at java.lang.Thread.run(Thread.java:748)

Caused by: io.netty.handler.codec.http.websocketx.CorruptedWebSocketFrameException: Max frame length of 65536 has been exceeded.

    at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.protocolViolation(WebSocket08FrameDecoder.java:426)

    at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.decode(WebSocket08FrameDecoder.java:286)

    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)

    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)

    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)

    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475)

    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1224)

    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1271)

    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)

    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)

    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)

    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)

    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)

    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)

    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)

    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)

    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)

    at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)

    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

    ... 1 more

At the end of the log I can also see the following which says that most of the nodes of the graph were not exported:

Config file : /neptune/results/1597704571888/config.json
--
 Source:
  Nodes: 319015
  Edges: 739118
Export:
  Nodes: 74
  Edges: 739118

Can you advise on how I could solve this issue?

upadhyay-prashant commented 4 years ago

Thanks Connor for reporting this. We will be looking into this soon.

since the exception message is due to maxContentLength, as a quick work-around can you try changing https://github.com/awslabs/amazon-neptune-tools/blob/master/export-neptune-to-elasticsearch/lambda/export_neptune_to_kinesis.py#L42 to have max-content-length parameter.

command = 'df -h && wget {} && export SERVICE_REGION="{}" && java -Xms8g -Xmx8g -jar neptune-export.jar {} -e {} -p {} -d /neptune/results --output stream --stream-name {} --region {} --max-content-length 2147483647 --format neptuneStreamsJson --log-level info --use-ssl{}{}{}'.format(
ctSkennerton commented 4 years ago

That did seem to improve things but not completely. I still get the following stacktrace

java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
  at com.amazonaws.services.neptune.propertygraph.metadata.DataType$5.printTo(DataType.java:81)
  at com.amazonaws.services.neptune.propertygraph.io.NeptuneStreamsJsonPropertyGraphPrinter.printRecord(NeptuneStreamsJsonPropertyGraphPrinter.java:133)
  at com.amazonaws.services.neptune.propertygraph.io.NeptuneStreamsJsonPropertyGraphPrinter.printRecord(NeptuneStreamsJsonPropertyGraphPrinter.java:114)
  at com.amazonaws.services.neptune.propertygraph.io.NeptuneStreamsJsonPropertyGraphPrinter.printProperties(NeptuneStreamsJsonPropertyGraphPrinter.java:71)
  at com.amazonaws.services.neptune.propertygraph.io.NodeWriter.handle(NodeWriter.java:36)
  at com.amazonaws.services.neptune.propertygraph.io.NodeWriter.handle(NodeWriter.java:18)
  at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.handle(ExportPropertyGraphTask.java:91)
  at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask$CountingHandler.handle(ExportPropertyGraphTask.java:132)
  at com.amazonaws.services.neptune.propertygraph.NodesClient.lambda$queryForValues$1(NodesClient.java:89)
  at org.apache.tinkerpop.gremlin.process.traversal.Traversal.forEachRemaining(Traversal.java:272)
  at com.amazonaws.services.neptune.propertygraph.NodesClient.queryForValues(NodesClient.java:87)
  at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.run(ExportPropertyGraphTask.java:71)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)

At the end I see the following, so about half the nodes are exported.

Source:
  Nodes: 319015
  Edges: 739118
Export:
  Nodes: 159581
  Edges: 739118
ankit370 commented 4 years ago

Looks like type cast error while reading data from Neptune. It would help if you can provide small reproducer.

iansrobinson commented 4 years ago

Thanks @ctSkennerton – I think I've reproduced the issue, and have pushed a fix.

I believe you may have some set cardinality properties with values of different types. The exporter was inferring the type of the values in the set based only on the first value: if this was an integer, but a subsequent value in the set was a string, the tool would raise the error you reported.

I've updated the exporter so that when it publishes these set cardinality properties, it identifies the type of each value in the set.

Note that the Neptune/Elasticsearch integration today only indexes string values – see https://docs.aws.amazon.com/neptune/latest/userguide/full-text-search-model.html – but the export part of this backfill solution will now accommodate set cardinality properties containing values of different types.

ctSkennerton commented 4 years ago

Thank you @iansrobinson that fixed my issue.