orientechnologies / orientdb

OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID Transactions, Full-Text indexing and Reactive Queries.
https://orientdb.dev
Apache License 2.0

OrientDb stopped working with script execution error #9957

Open sahirshahs opened 1 year ago

sahirshahs commented 1 year ago

OrientDB Version: 3.2.2 base image in kubernetes container

Java Version: Orientdb 3.2.2 base image running on container

OS: Kubernetes running on Linux

Expected behavior

OrientDB started throwing internal server errors, script execution timeout errors, restart errors, etc. and then stopped working. The repair database command does not detect any errors. Two different types of error messages logged in the DB console are attached here. A similar issue has occurred on multiple instances in our testing and production environments, where it is critical. Any help with recovering from this issue and preventing it in the future would be appreciated.

Actual behavior

## Steps to reproduce

The exact steps to reproduce the issue are not clear because it started suddenly. Below are the errors logged on the console at two different times.

1st error:

```
2023-03-21 12:10:56:477 WARNI Execution of thread 'Thread[gremlin-server-exec-1,5,main]' is interrupted [OStorageInterruptionManager]
Script evaluation exceeded the configured threshold for request [RequestMessage{, requestId=070f0ad8-b257-43a1-971f-052521b00a94, op='eval', processor='', args={gremlin=g.V().hasLabel('object_model').as('m').has('s_model','abb.controlSystem.800xA.aspectObject').has('s_objectId','9dcff07a-c45b-483d-9db6-209d2282a015')}}]
java.util.concurrent.TimeoutException: Evaluation exceeded the configured 'evaluationTimeout' threshold of 6000000 ms or evaluation was otherwise cancelled directly for request [g.V().hasLabel('object_model').as('m').has('s_model','abb.controlSystem.800xA.aspectObject').has('s_objectId','9dcff07a-c45b-483d-9db6-209d2282a015')]
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$1(GremlinExecutor.java:316)
    at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
    at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.lang.Thread.run(Thread.java:748)
```

2nd error:

```
com.orientechnologies.orient.core.exception.OStorageException: Internal error happened in storage abcodb please restart the server or re-open the storage to undergo the restore process and fix the error.
    DB name="abcodb"
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.checkErrorState(OAbstractPaginatedStorage.java:4864)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.checkOpennessAndMigration(OAbstractPaginatedStorage.java:4848)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.getClusterNames(OAbstractPaginatedStorage.java:2131)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentAbstract.getClusterNames(ODatabaseDocumentAbstract.java:721)
    at com.orientechnologies.orient.core.sql.executor.OSelectExecutionPlanner.calculateShardingStrategy(OSelectExecutionPlanner.java:262)
    at com.orientechnologies.orient.core.sql.executor.OSelectExecutionPlanner.createExecutionPlan(OSelectExecutionPlanner.java:115)
    at com.orientechnologies.orient.core.sql.parser.OSelectStatement.createExecutionPlan(OSelectStatement.java:302)
    at com.orientechnologies.orient.core.sql.parser.OSelectStatement.execute(OSelectStatement.java:291)
    at com.orientechnologies.orient.core.sql.parser.OStatement.execute(OStatement.java:81)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentEmbedded.query(ODatabaseDocumentEmbedded.java:641)
    at org.apache.tinkerpop.gremlin.orientdb.OrientGraph.querySql(OrientGraph.java:261)
    at org.apache.tinkerpop.gremlin.orientdb.OrientStandardGraph.querySql(OrientStandardGraph.java:192)
    at org.apache.tinkerpop.gremlin.orientdb.OrientGraphQuery.execute(OrientGraphQuery.java:31)
    at org.apache.tinkerpop.gremlin.orientdb.traversal.step.sideeffect.OrientGraphStep.lambda$elements$6(OrientGraphStep.java:89)
    at java.util.Optional.map(Optional.java:215)
    at org.apache.tinkerpop.gremlin.orientdb.traversal.step.sideeffect.OrientGraphStep.elements(OrientGraphStep.java:87)
    at org.apache.tinkerpop.gremlin.orientdb.traversal.step.sideeffect.OrientGraphStep.vertices(OrientGraphStep.java:51)
    at org.apache.tinkerpop.gremlin.orientdb.traversal.step.sideeffect.OrientGraphStep.lambda$new$0(OrientGraphStep.java:43)
    at org.apache.tinkerpop.gremlin.process.traversal.step.map.GraphStep.processNextStart(GraphStep.java:157)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:197)
    at org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor.handleIterator(AbstractOpProcessor.java:93)
    at org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor.lambda$evalOpInternal$5(AbstractEvalOpProcessor.java:264)
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:278)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.orientechnologies.orient.core.exception.OStorageException: Internal error happened in storage abcodb please restart the server or re-open the storage to undergo the restore process and fix the error.
    DB name="abcodb"
    ... 30 more
Caused by: com.orientechnologies.orient.core.exception.OStorageException: Internal error happened in storage abcodb please restart the server or re-open the storage to undergo the restore process and fix the error.
    DB name="abcodb"
    ... 30 more
Caused by: com.orientechnologies.orient.core.exception.OStorageException: Internal error happened in storage abcodb please restart the server or re-open the storage to undergo the restore process and fix the error.
    DB name="abcodb"
    ... 30 more
Caused by: com.orientechnologies.orient.core.exception.OStorageException: Storage abcodb is not opened.
    DB name="abcodb"
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.checkOpennessAndMigration(OAbstractPaginatedStorage.java:4858)
    ... 28 more
Exception processing a script on request [RequestMessage{, requestId=fe1c824f-f10c-4e22-abca-9943fe047eab, op='eval', processor='', args={gremlin=g.V().hasLabel('object_model').has('s_objectId', 'ae9277c7-45a6-4468-8ddf-0a77343c3cfd').has('s_model','abc.iom.workflowManager.ModelS107314')}}].
com.orientechnologies.orient.core.exception.OStorageException: Storage abcodb is not opened.
    DB name="abcodb"
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.checkOpennessAndMigration(OAbstractPaginatedStorage.java:4858)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.getClusterNames(OAbstractPaginatedStorage.java:2131)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentAbstract.getClusterNames(ODatabaseDocumentAbstract.java:721)
    at com.orientechnologies.orient.core.sql.executor.OSelectExecutionPlanner.calculateShardingStrategy(OSelectExecutionPlanner.java:262)
    at com.orientechnologies.orient.core.sql.executor.OSelectExecutionPlanner.createExecutionPlan(OSelectExecutionPlanner.java:115)
    at com.orientechnologies.orient.core.sql.parser.OSelectStatement.createExecutionPlan(OSelectStatement.java:302)
    at com.orientechnologies.orient.core.sql.parser.OSelectStatement.execute(OSelectStatement.java:291)
    at com.orientechnologies.orient.core.sql.parser.OStatement.execute(OStatement.java:81)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentEmbedded.query(ODatabaseDocumentEmbedded.java:641)
    at org.apache.tinkerpop.gremlin.orientdb.OrientGraph.querySql(OrientGraph.java:261)
    at org.apache.tinkerpop.gremlin.orientdb.OrientStandardGraph.querySql(OrientStandardGraph.java:192)
    at org.apache.tinkerpop.gremlin.orientdb.OrientGraphQuery.execute(OrientGraphQuery.java:31)
    at org.apache.tinkerpop.gremlin.orientdb.traversal.step.sideeffect.OrientGraphStep.lambda$elements$6(OrientGraphStep.java:89)
    at java.util.Optional.map(Optional.java:215)
    at org.apache.tinkerpop.gremlin.orientdb.traversal.step.sideeffect.OrientGraphStep.elements(OrientGraphStep.java:87)
    at org.apache.tinkerpop.gremlin.orientdb.traversal.step.sideeffect.OrientGraphStep.vertices(OrientGraphStep.java:51)
    at org.apache.tinkerpop.gremlin.orientdb.traversal.step.sideeffect.OrientGraphStep.lambda$new$0(OrientGraphStep.java:43)
    at org.apache.tinkerpop.gremlin.process.traversal.step.map.GraphStep.processNextStart(GraphStep.java:157)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:197)
    at org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor.handleIterator(AbstractOpProcessor.java:93)
    at org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor.lambda$evalOpInternal$5(AbstractEvalOpProcessor.java:264)
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:278)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```

tglman commented 1 year ago

Hi,

This specific error is a "downstream" error of something else that happened earlier. Usually a restart of the server/node is enough to clear it, provided nothing else major has happened. Version 3.2.2 is quite old now, so we suggest updating to the latest hotfix (3.2.17 at the time of writing). There have been quite a lot of fixes in the meantime that prevent this error from happening in recent versions.

So feel free to update to the latest hotfix and report back if you still experience similar problems.
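Since the storage error is only a downstream symptom, it may help to look at what was logged before its first occurrence. The following is a minimal Python sketch, not an OrientDB tool: it assumes log entries start with a timestamp and level in the format seen in the excerpt above (`2023-03-21 12:10:56:477 WARNI ...`). The function name `entries_before_downstream` is hypothetical, and the `SEVER` level in the default filter is an assumption, since only `WARNI` appears in the posted logs.

```python
import re

# Timestamp + level prefix as seen in the log excerpt in this issue,
# e.g. "2023-03-21 12:10:56:477 WARNI ...".
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3}) (\w+) (.*)$")

# Message text of the downstream error quoted in this issue.
DOWNSTREAM = "Internal error happened in storage"


def entries_before_downstream(lines, levels=("WARNI", "SEVER")):
    """Collect (timestamp, level, message) for log entries at the given
    levels that appear before the first downstream storage error.

    The default levels are an assumption; adjust them to match your logs.
    """
    found = []
    for line in lines:
        if DOWNSTREAM in line:
            break  # everything after this point is a symptom, not a cause
        match = ENTRY.match(line)
        if match and match.group(2) in levels:
            found.append(match.groups())
    return found
```

It could be run over a saved console log, for example `entries_before_downstream(open("server.log"))`, where `server.log` is a placeholder path. The entries it returns are only candidates for the earlier failure and still need to be read in context.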

Regards