JanusGraph / janusgraph

JanusGraph: an open-source, distributed graph database
https://janusgraph.org
Other
5.34k stars 1.18k forks source link

Using ConfiguredGraphFactory.drop failed when writingLogSetting #3106

Open liu-chao-simon opened 2 years ago

liu-chao-simon commented 2 years ago

Discussed in https://github.com/JanusGraph/janusgraph/discussions/3105

Originally posted by **liu-chao-simon** June 26, 2022 I am using dynamic graphs features of Janusgraph 0.6.2, i can create graph successfully. But from the janusgraph server side, it always show exception log when using ConfiguredGraphFactory.drop to drop a graph. I'm using CQL(a single node Cassandra) as storage backend, and here is the server configurations gremlin-server.yaml ``` host: 0.0.0.0 port: 8182 evaluationTimeout: 0 channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer graphManager: org.janusgraph.graphdb.management.JanusGraphManager graphs: ConfigurationManagementGraph: conf\janusgraph-cql.properties scriptEngines: gremlin-groovy: plugins: org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {} org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {} org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {} org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]} org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: []} serializers: - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }} - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }} - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }} # Older serialization versions for backwards compatibility: - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }} - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }} - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }} - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }} - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }} - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }} - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1 } - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { serializeResultToString: true }} processors: - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }} - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }} metrics: consoleReporter: {enabled: true, interval: 180000} csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv} jmxReporter: {enabled: true} slf4jReporter: {enabled: true, interval: 180000} gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST} graphiteReporter: {enabled: false, interval: 180000} maxInitialLineLength: 4096 maxHeaderSize: 8192 maxChunkSize: 8192 maxContentLength: 65536000 maxAccumulationBufferComponents: 1024 resultIterationBatchSize: 64 writeBufferLowWaterMark: 32768 writeBufferHighWaterMark: 65536 threadPoolWorker: 4 gremlinPool: 4 ``` janusgraph-cql.properties ``` gremlin.graph=org.janusgraph.core.ConfiguredGraphFactory graph.graphname=ConfigurationManagementGraph storage.backend=cql storage.hostname=192.168.1.100:9042 ``` I can create graph successfully using scripts ``` map = new HashMap(); map.put("storage.backend", "cql"); map.put("storage.hostname", "192.168.1.100:9042"); ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map)); g1 = ConfiguredGraphFactory.create("graph1"); ``` But, exception always happen on server side when drop graph by `ConfiguredGraphFactory.drop("graph1")` ``` 1354978 [pool-10-thread-1] INFO org.janusgraph.diskstorage.log.kcvs.KCVSLog - Loaded unidentified ReadMarker start time 2022-06-26T06:36:07.607Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@26790a9f 1357285 [pool-33-thread-1] INFO org.janusgraph.core.ConfiguredGraphFactory$DropGraphOnEvictionTrigger - Graph a has been removed from the graph cache on every JanusGraph node in the cluster. 1357285 [pool-33-thread-1] WARN org.janusgraph.core.ConfiguredGraphFactory$DropGraphOnEvictionTrigger - Attempting to drop the graph a. **1358411 [pool-33-thread-1] ERROR org.janusgraph.diskstorage.log.kcvs.KCVSLog - Reader thread pool for KCVSLog systemlog did not shut down in time - could not clean up or set read markers** 1358412 [pool-33-thread-1] INFO org.janusgraph.diskstorage.util.BackendOperation - Temporary exception during backend operation [writingLogSetting]. Attempting backoff retry. org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend at io.vavr.API$Match$Case0.apply(API.java:5135) at io.vavr.API$Match.of(API.java:5092) at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.lambda$static$0(CQLKeyColumnValueStore.java:120) at org.janusgraph.diskstorage.cql.function.mutate.AbstractCQLMutateManyUnloggedFunction.mutateMany(AbstractCQLMutateManyUnloggedFunction.java:64) at org.janusgraph.diskstorage.cql.CQLStoreManager.mutateMany(CQLStoreManager.java:302) at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.mutate(CQLKeyColumnValueStore.java:430) at org.janusgraph.diskstorage.log.kcvs.KCVSLog$3.call(KCVSLog.java:915) at org.janusgraph.diskstorage.log.kcvs.KCVSLog$3.call(KCVSLog.java:912) at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:106) at org.janusgraph.diskstorage.util.BackendOperation$1.call(BackendOperation.java:120) at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:66) at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:52) at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:117) at org.janusgraph.diskstorage.log.kcvs.KCVSLog.writeSetting(KCVSLog.java:912) at org.janusgraph.diskstorage.log.kcvs.KCVSLog.lambda$close$0(KCVSLog.java:346) at org.janusgraph.util.system.ExecuteUtil.executeWithCatching(ExecuteUtil.java:32) at org.janusgraph.diskstorage.log.kcvs.KCVSLog.close(KCVSLog.java:344) at org.janusgraph.util.system.ExecuteUtil.executeWithCatching(ExecuteUtil.java:32) at org.janusgraph.diskstorage.log.kcvs.KCVSLogManager.close(KCVSLogManager.java:251) at org.janusgraph.util.system.ExecuteUtil.executeWithCatching(ExecuteUtil.java:32) at org.janusgraph.diskstorage.Backend.close(Backend.java:594) at org.janusgraph.util.system.IOUtils.closeQuietly(IOUtils.java:63) at org.janusgraph.graphdb.database.StandardJanusGraph.closeInternal(StandardJanusGraph.java:286) at org.janusgraph.graphdb.database.StandardJanusGraph.close(StandardJanusGraph.java:244) at org.janusgraph.core.JanusGraphFactory.drop(JanusGraphFactory.java:224) at org.janusgraph.core.ConfiguredGraphFactory$DropGraphOnEvictionTrigger.call(ConfiguredGraphFactory.java:189) at org.janusgraph.core.ConfiguredGraphFactory$DropGraphOnEvictionTrigger.call(ConfiguredGraphFactory.java:169) at org.janusgraph.graphdb.database.management.ManagementLogger$EvictionTrigger.runTriggers(ManagementLogger.java:191) at org.janusgraph.graphdb.database.management.ManagementLogger$EvictionTrigger.receivedAcknowledgement(ManagementLogger.java:183) at org.janusgraph.graphdb.database.management.ManagementLogger.read(ManagementLogger.java:114) at org.janusgraph.diskstorage.log.util.ProcessMessageJob.run(ProcessMessageJob.java:46) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266) at java.util.concurrent.FutureTask.run(FutureTask.java) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.util.concurrent.ExecutionException: java.lang.InterruptedException at io.vavr.concurrent.FutureImpl$1.block(FutureImpl.java:238) at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323) at io.vavr.concurrent.FutureImpl._await(FutureImpl.java:202) at io.vavr.concurrent.FutureImpl.await(FutureImpl.java:168) at org.janusgraph.diskstorage.cql.function.mutate.CQLExecutorServiceMutateManyUnloggedFunction.mutate(CQLExecutorServiceMutateManyUnloggedFunction.java:59) at org.janusgraph.diskstorage.cql.function.mutate.AbstractCQLMutateManyUnloggedFunction.mutateMany(AbstractCQLMutateManyUnloggedFunction.java:61) ... 35 more Caused by: java.lang.InterruptedException ... 41 more ``` Please note that ERROR log: `1358411 [pool-33-thread-1] ERROR org.janusgraph.diskstorage.log.kcvs.KCVSLog - Reader thread pool for KCVSLog systemlog did not shut down in time - could not clean up or set read markers` I have traced this error into method `KCVSLog.close()`, here are the key lines ``` if (readExecutor!=null) { try { readExecutor.awaitTermination(1,TimeUnit.SECONDS); } catch (InterruptedException e) { log.error("Could not terminate reader thread pool for KCVSLog {} due to interruption", name, e); } if (!readExecutor.isTerminated()) { readExecutor.shutdownNow(); log.error("Reader thread pool for KCVSLog {} did not shut down in time - could not clean up or set read markers", name); } else { for (MessagePuller puller : msgPullers) { puller.close(); } } } ExceptionWrapper exceptionWrapper = new ExceptionWrapper(); ExecuteUtil.executeWithCatching(() -> { try{ writeSetting(manager.senderId, MESSAGE_COUNTER_COLUMN, numMsgCounter.get()); } catch (Throwable e){ log.error("Could not persist message counter [{}] ; message counter [{}]", manager.senderId, numMsgCounter.get(), e); throw e; } }, exceptionWrapper); ``` I found that `readExecutor.shutdownNow()` is trying to terminate itself, this throws an InterruptionException which will cause the `writeSetting` failure. Seems this is a inevitable failure during dropping graph, or am i using the wrong configurations? Hoping your answers~
li-boxuan commented 2 years ago

Not sure if this is related, but have you tried using a higher value (e.g. 30000) for evaluationTimeout? Right now you are using 0 which does not make much sense to me.

baughmann commented 2 years ago

@li-boxuan Per the Gremlin docs (CTFL+F for "evaluationTimeout")

The amount of time in milliseconds before a request evaluation and iteration of result times out. This feature can be turned off by setting the value to 0.

So @liu-chao-simon is disabling the timeout by setting it to 0. He needs to, however, also run :remote config timeout none in his gremlin console in order to disable the timeout on the client end as well. Need to be separately disabled on both.

panda0120 commented 2 days ago

I use hbase backend,and have the same bug, @liu-chao-simon ,how can i fix it?Thank you very much.