OpenCTI-Platform / opencti

Open Cyber Threat Intelligence Platform
https://opencti.io

Multiple workers and PermanentBackendException: Permanent failure in storage backend #211

Closed cvdsouza closed 5 years ago

cvdsouza commented 5 years ago

Hi, I'm hoping to get some direction on an error I'm seeing when installing the platform using Docker.

Description

After running docker-compose --compatibility up, while the imports/exports are running, I get the following error:

opencti_1            | error: undefined {"code":2,"metadata":{"_internal_repr":{},"flags":0},"details":"Could not execute operation due to backend exception. Please check server logs for the stack trace."}
opencti_1            | error: undefined {"code":2,"metadata":{"_internal_repr":{},"flags":0},"details":"Could not execute operation due to backend exception. Please check server logs for the stack trace."}
opencti_1            | error: undefined {"code":2,"metadata":{"_internal_repr":{},"flags":0},"details":"Could not execute operation due to backend exception. Please check server logs for the stack trace."}
opencti_1            | error: undefined {"code":2,"metadata":{"_internal_repr":{},"flags":0},"details":"Could not execute operation due to backend exception. Please check server logs for the stack trace."}
opencti_1            | error: 2 UNKNOWN: Could not call index. Please check server logs for the stack trace. {"locations":[{"line":3,"column":16}],"path":["attackPatternAdd"],"extensions":{"code":"INTERNAL_SERVER_ERROR","exception":{"code":2,"metadata":{"_internal_repr":{},"flags":0},"details":"Could not call index. Please check server logs for the stack trace.","stacktrace":["Error: 2 UNKNOWN: Could not call index. Please check server logs for the stack trace.","    at Object.exports.createStatusError (/opt/opencti/node_modules/grpc/src/common.js:91:15)","    at ClientDuplexStream._emitStatusIfDone (/opt/opencti/node_modules/grpc/src/client.js:233:26)","    at ClientDuplexStream._receiveStatus (/opt/opencti/node_modules/grpc/src/client.js:211:8)","    at Object.onReceiveStatus (/opt/opencti/node_modules/grpc/src/client_interceptors.js:1306:15)","    at InterceptingListener._callNext (/opt/opencti/node_modules/grpc/src/client_interceptors.js:568:42)","    at InterceptingListener.onReceiveStatus (/opt/opencti/node_modules/grpc/src/client_interceptors.js:618:8)","    at /opt/opencti/node_modules/grpc/src/client_interceptors.js:1123:18"]}}}
opencti_1            | error: undefined {"code":2,"metadata":{"_internal_repr":{},"flags":0},"details":"Could not call index. Please check server logs for the stack trace."}
worker-import_8      | ERROR:root:An unknown error has occurred!  Please try again later.
worker-import_8      | ERROR:root:An unexpected error occurred: { 'NoneType' object is not subscriptable }
opencti_1            | error: undefined {"code":2,"metadata":{"_internal_repr":{},"flags":0},"details":"Could not execute operation due to backend exception. Please check server logs for the stack trace."}
worker-import_8      | INFO:root:Message (type=stix2-bundle, delivery_tag=11) processed, thread terminated
worker-import_8      | INFO:root:Processing a new message (type=stix2-bundle, delivery_tag=12), launching a thread...
worker-import_8      | INFO:root:Importing a marking-definition
grakn_1              | 2019-09-07 20:56:11,863 [transaction-listener-0] ERROR g.c.s.r.SessionService$TransactionListener - Runtime Exception in RPC TransactionListener: 
grakn_1              | org.janusgraph.core.JanusGraphException: Could not call index
grakn_1              |  at org.janusgraph.graphdb.util.SubqueryIterator.<init>(SubqueryIterator.java:68)
grakn_1              |  at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3.execute(StandardJanusGraphTx.java:1298)
grakn_1              |  at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3.execute(StandardJanusGraphTx.java:1190)
grakn_1              |  at org.janusgraph.graphdb.query.QueryProcessor$LimitAdjustingIterator.getNewIterator(QueryProcessor.java:194)
grakn_1              |  at org.janusgraph.graphdb.query.LimitAdjustingIterator.hasNext(LimitAdjustingIterator.java:68)
grakn_1              |  at org.janusgraph.graphdb.query.ResultSetIterator.nextInternal(ResultSetIterator.java:54)
grakn_1              |  at org.janusgraph.graphdb.query.ResultSetIterator.<init>(ResultSetIterator.java:44)
grakn_1              |  at org.janusgraph.graphdb.query.QueryProcessor.iterator(QueryProcessor.java:66)
grakn_1              |  at com.google.common.collect.Iterables$4.iterator(Iterables.java:578)
grakn_1              |  at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphStep.executeGraphCentryQuery(JanusGraphStep.java:156)
grakn_1              |  at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphStep.lambda$null$1(JanusGraphStep.java:95)
grakn_1              |  at java.lang.Iterable.forEach(Iterable.java:75)
grakn_1              |  at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphStep.lambda$new$2(JanusGraphStep.java:95)
grakn_1              |  at org.apache.tinkerpop.gremlin.process.traversal.step.map.GraphStep.processNextStart(GraphStep.java:142)
grakn_1              |  at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
grakn_1              |  at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
grakn_1              |  at grakn.core.server.session.TransactionOLTP.getConcept(TransactionOLTP.java:496)
grakn_1              |  at grakn.core.server.session.TransactionOLTP.getConcept(TransactionOLTP.java:491)
grakn_1              |  at grakn.core.server.session.TransactionOLTP.getSchemaConcept(TransactionOLTP.java:772)
grakn_1              |  at grakn.core.server.session.TransactionOLTP.getSchemaConcept(TransactionOLTP.java:764)
grakn_1              |  at grakn.core.server.session.TransactionOLTP.getSchemaConcept(TransactionOLTP.java:814)
grakn_1              |  at grakn.core.graql.reasoner.atom.predicate.IdPredicate.createIdVar(IdPredicate.java:70)
grakn_1              |  at grakn.core.graql.reasoner.atom.predicate.IdPredicate.create(IdPredicate.java:50)
grakn_1              |  at grakn.core.graql.reasoner.utils.ReasonerUtils.getIdPredicate(ReasonerUtils.java:106)
grakn_1              |  at grakn.core.graql.executor.property.IsaExecutor.atomic(IsaExecutor.java:76)
grakn_1              |  at grakn.core.graql.reasoner.atom.AtomicFactory.lambda$createAtoms$0(AtomicFactory.java:56)
grakn_1              |  at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
grakn_1              |  at java.util.Iterator.forEachRemaining(Iterator.java:116)
grakn_1              |  at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
grakn_1              |  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
grakn_1              |  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
grakn_1              |  at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
grakn_1              |  at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
grakn_1              |  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
grakn_1              |  at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
grakn_1              |  at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
grakn_1              |  at java.util.Iterator.forEachRemaining(Iterator.java:116)
grakn_1              |  at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
grakn_1              |  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
grakn_1              |  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
grakn_1              |  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
grakn_1              |  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
grakn_1              |  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
grakn_1              |  at grakn.core.graql.reasoner.atom.AtomicFactory.createAtoms(AtomicFactory.java:58)
grakn_1              |  at grakn.core.graql.reasoner.query.ReasonerQueryImpl.<init>(ReasonerQueryImpl.java:103)
grakn_1              |  at grakn.core.graql.reasoner.query.ReasonerQueries.createWithoutRoleInference(ReasonerQueries.java:82)
grakn_1              |  at grakn.core.graql.executor.QueryExecutor.lambda$validateClause$6(QueryExecutor.java:154)
grakn_1              |  at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
grakn_1              |  at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
grakn_1              |  at java.util.Iterator.forEachRemaining(Iterator.java:116)
grakn_1              |  at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
grakn_1              |  at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
grakn_1              |  at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
grakn_1              |  at java.util.Iterator.forEachRemaining(Iterator.java:116)
grakn_1              |  at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
grakn_1              |  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
grakn_1              |  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
grakn_1              |  at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
grakn_1              |  at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
grakn_1              |  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
grakn_1              |  at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
grakn_1              |  at grakn.core.graql.executor.QueryExecutor.validateClause(QueryExecutor.java:154)
grakn_1              |  at grakn.core.graql.executor.QueryExecutor.match(QueryExecutor.java:106)
grakn_1              |  at grakn.core.graql.executor.QueryExecutor.get(QueryExecutor.java:381)
grakn_1              |  at grakn.core.server.session.TransactionOLTP.stream(TransactionOLTP.java:343)
grakn_1              |  at grakn.core.api.Transaction.stream(Transaction.java:270)
grakn_1              |  at grakn.core.server.rpc.SessionService$TransactionListener.query(SessionService.java:328)
grakn_1              |  at grakn.core.server.rpc.SessionService$TransactionListener.handleRequest(SessionService.java:217)
grakn_1              |  at grakn.core.server.rpc.SessionService$TransactionListener.lambda$onNext$1(SessionService.java:175)
grakn_1              |  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
grakn_1              |  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
grakn_1              |  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
grakn_1              |  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
grakn_1              |  at java.lang.Thread.run(Thread.java:748)
grakn_1              | Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent failure in storage backend
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.convertException(CassandraThriftKeyColumnValueStore.java:264)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:159)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:108)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:97)
grakn_1              |  at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
grakn_1              |  at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
grakn_1              |  at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:399)
grakn_1              |  at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:396)
grakn_1              |  at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
grakn_1              |  at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
grakn_1              |  at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
grakn_1              |  at org.janusgraph.diskstorage.BackendTransaction.indexQuery(BackendTransaction.java:396)
grakn_1              |  at org.janusgraph.graphdb.query.graph.MultiKeySliceQuery.execute(MultiKeySliceQuery.java:51)
grakn_1              |  at org.janusgraph.graphdb.database.IndexSerializer.query(IndexSerializer.java:512)
grakn_1              |  at org.janusgraph.graphdb.util.SubqueryIterator.<init>(SubqueryIterator.java:66)
grakn_1              |  ... 73 common frames omitted
grakn_1              | Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Cannot assign requested address (connect failed)
grakn_1              |  at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
grakn_1              |  at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeRawConnection(CTConnectionFactory.java:110)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:74)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:43)
grakn_1              |  at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1220)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:144)
grakn_1              |  ... 86 common frames omitted
grakn_1              | Caused by: java.net.ConnectException: Cannot assign requested address (connect failed)
grakn_1              |  at java.net.PlainSocketImpl.socketConnect(Native Method)
grakn_1              |  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
grakn_1              |  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
grakn_1              |  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
grakn_1              |  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
grakn_1              |  at java.net.Socket.connect(Socket.java:589)
grakn_1              |  at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
grakn_1              |  ... 92 common frames omitted
worker-import_3      | INFO:root:Relationships imported in: 1secs
worker-import_3      | INFO:root:Reports imported in: 0secs
worker-import_7      | INFO:root:Objects imported in: 0secs
worker-import_7      | INFO:root:Getting relation relationship--341fa858-427e-4a99-8336-0e2555567afb...
worker-import_3      | INFO:root:Message (type=stix2-bundle, delivery_tag=4) processed, thread terminated
worker-import_3      | INFO:root:Processing a new message (type=stix2-bundle, delivery_tag=5), launching a thread...
worker-import_3      | INFO:root:Marking definitions imported in: 0secs
worker-import_3      | INFO:root:Importing a identity
worker-import_7      | INFO:root:Getting relation relationship--341fa858-427e-4a99-8336-0e2555567afb...
worker-import_3      | INFO:root:Importing a identity
worker-import_7      | INFO:root:Getting relations, from: 5bef2adc-9b97-407b-a6d5-79b2f4162dfb, to: 08a4a54d-c70d-40ab-ba46-84b1d6a9883d...
worker-import_3      | INFO:root:Creating identity Healthcare research...
worker-import_7      | INFO:root:Creating relation gather => part_of...
worker-import_3      | INFO:root:Identities imported in: 1secs
worker-import_3      | INFO:root:Importing a identity
worker-import_7      | INFO:root:Relationships imported in: 2secs
worker-import_7      | INFO:root:Reports imported in: 0secs
worker-import_3      | INFO:root:Importing a identity

The docker-compose run hasn't stopped, but the error recurs periodically.

Environment

  1. Ubuntu 16.04 LTS
  2. OpenCTI 1.1.2
  3. OpenCTI client: frontend

Reproducible Steps

Steps to create the smallest reproducible scenario:

  1. docker-compose --compatibility up

Expected Output

In previous versions, the errors were not observed. It was when I used the recently updated docker-compose file that I received these errors.

Attaching the YAML file (without passwords): docker-compose (1).txt

richard-julien commented 5 years ago

It seems to be a grakn problem.

grakn_1              | Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent failure in storage backend
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.convertException(CassandraThriftKeyColumnValueStore.java:264)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:159)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:108)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:97)
grakn_1              |  at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
grakn_1              |  at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
grakn_1              |  at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:399)
grakn_1              |  at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:396)
grakn_1              |  at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
grakn_1              |  at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
grakn_1              |  at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
grakn_1              |  at org.janusgraph.diskstorage.BackendTransaction.indexQuery(BackendTransaction.java:396)
grakn_1              |  at org.janusgraph.graphdb.query.graph.MultiKeySliceQuery.execute(MultiKeySliceQuery.java:51)
grakn_1              |  at org.janusgraph.graphdb.database.IndexSerializer.query(IndexSerializer.java:512)
grakn_1              |  at org.janusgraph.graphdb.util.SubqueryIterator.<init>(SubqueryIterator.java:66)
grakn_1              |  ... 73 common frames omitted
grakn_1              | Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Cannot assign requested address (connect failed)
grakn_1              |  at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
grakn_1              |  at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeRawConnection(CTConnectionFactory.java:110)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:74)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:43)
grakn_1              |  at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1220)
grakn_1              |  at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:144)
grakn_1              |  ... 86 common frames omitted
grakn_1              | Caused by: java.net.ConnectException: Cannot assign requested address (connect failed)
grakn_1              |  at java.net.PlainSocketImpl.socketConnect(Native Method)
grakn_1              |  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
grakn_1              |  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
grakn_1              |  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
grakn_1              |  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
grakn_1              |  at java.net.Socket.connect(Socket.java:589)
grakn_1              |  at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
grakn_1              |  ... 92 common frames omitted

Looking at your docker-compose file, do you really have 32 GB of memory to run Grakn? If so, can you try starting Grakn alone (removing everything else from the docker-compose file) and see what happens? You could also try going back to Grakn version 1.5.7 and see whether the error occurs again.

Thanks.

cvdsouza commented 5 years ago

Re: memory, yes. My VM has 60 CPUs, 120 GB RAM, and a 250 GB SSD. I followed your advice and stood up Grakn alone, with no issues. I then started adding my configs back into the docker-compose file. The portion that caused these errors was the worker-import and worker-export replicas, which I had increased to 10. I did this because importing the MITRE connector data into OpenCTI was taking over 24 hours, so I hoped that increasing the number of workers would speed up the process. However, Grakn starts throwing those errors when I increase the replica counts.
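For readers following along, the setting in question can be sketched as below. This is only an illustrative fragment, not the attached file: the service names are taken from the log prefixes above, the rest of each service definition (image, environment, dependencies) is omitted, and the exact layout of the real compose file is an assumption. With docker-compose --compatibility, the deploy section of a version 3 file is translated to its non-Swarm equivalent, so replicas determines how many worker containers are started.

```yaml
# Hypothetical excerpt of a version 3 compose file; only the replica settings are shown.
version: '3'
services:
  worker-import:
    # image, environment, etc. omitted
    deploy:
      mode: replicated
      replicas: 10   # value reported in this thread as triggering the Grakn errors
  worker-export:
    # image, environment, etc. omitted
    deploy:
      mode: replicated
      replicas: 10   # lower this together with worker-import to check whether the errors stop
```

Reducing both values and re-running docker-compose --compatibility up is one way to confirm that the errors track the number of workers.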

richard-julien commented 5 years ago

Thanks for testing. This seems to be a problem in Grakn's cache management. I will try to reproduce it and open an issue on the Grakn side.

SamuelHassine commented 5 years ago

@cvdsouza, in OpenCTI 2.0.0 we will migrate to the new version of Grakn (1.5.9). I will leave this issue open until the release. I think your problem will be solved by this new release.

SamuelHassine commented 5 years ago

@cvdsouza The migration to Grakn 1.5.9 will be in 2.0.1 (release will be done today).