orientechnologies / orientdb

OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID Transactions, Full-Text indexing and Reactive Queries.
https://orientdb.dev
Apache License 2.0

Orient reading causing slowness in distributed mode. #8986

Closed rbaarathi closed 4 years ago

rbaarathi commented 5 years ago

OrientDB: 2.2.20
Java: 8
OS: CentOS 7

Expected behavior: Read performance from the OrientDB database should stay the same as on the day the application server was started. Instead, it degrades a little more with each day the server stays up.

Actual behavior:

We run our application in distributed mode and use OrientDB document and binary databases to store some of its data.

After the server has been running for 2 or 3 days, threads fetching data from OrientDB show a different stack, and reads become slow. We do not see this stack during the first days after a restart.

Could you please advise on this?

After a fresh restart, reads from OrientDB are fast and we observe no slowness. After 3 or 4 days of uptime, this new behavior appears, and we do not understand it.

The jstack output of a thread reading from OrientDB is below:

"PlugInvokationTask-190801080232192319241-19080108023298132874-CircuitPlug" #1341 prio=5 os_prio=0 tid=0x00002b01b800b000 nid=0xa0ea runnable [0x00002b050a0a1000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
    at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
    at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
    at java.lang.ThreadLocal.get(ThreadLocal.java:163)
    at com.orientechnologies.orient.core.db.ODatabaseRecordThreadLocal.getIfDefined(ODatabaseRecordThreadLocal.java:77)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.checkIfActive(ODatabaseDocumentTx.java:3454)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:1952)
    at com.orientechnologies.orient.core.tx.OTransactionNoTx.loadRecord(OTransactionNoTx.java:106)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:1729)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:102)
    at com.orientechnologies.orient.core.id.ORecordId.getRecord(ORecordId.java:329)
    at in.co.nmsworks.documenttdb.db.DocDB.getDocument(DocDB.java:300)
    at in.co.nmsworks.documenttdb.db.DocDB.getDocTypeObject(DocDB.java:244)
    at in.co.nmsworks.documenttdb.db.DocDB.getDocTypeObject(DocDB.java:239)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.InvReader.getEntity(InvReader.java:159)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.InvReader.getEntity(InvReader.java:135)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.dataprovider.OrientDataProvider.getEntity(OrientDataProvider.java:83)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.SDHRcaInvReader.getEntitiesMap(SDHRcaInvReader.java:1087)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.SDHRcaInvReader.addEntities(SDHRcaInvReader.java:303)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.SDHRcaInvReader.buildImpactAnalysisDataHolder(SDHRcaInvReader.java:826)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.SDHRcaInvReader.buildImpactAnalysisDataHolder(SDHRcaInvReader.java:706)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.impl.circuit.impact.node.constructor.impl2.DefaultImpactAnalysisDataBuilder.buildImpactAnalysisData(DefaultImpactAnalysisDataBuilder.java:42)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.impl.circuit.impl2.CircuitPlugHelper.getImpactAnalysisDataHolder(CircuitPlugHelper.java:162)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.impl.circuit.impl2.CircuitPlug.getImpactAnalysisDataHolder(CircuitPlug.java:402)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.impl.circuit.impl2.CircuitPlug.analyze(CircuitPlug.java:276)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.Plug.compute(Plug.java:564)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.Plug.doComputeForSingleEvent(Plug.java:236)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.Plug.doComputeForEvent(Plug.java:204)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.server.ComputePlugJob.doWork(ComputePlugJob.java:96)
    at com.nmsworks.cygnet.util.treejob.TreeJob.run(TreeJob.java:49)
    at com.nmsworks.cygnet.util.orderingexecutor.MultiKeyOrderingExecutor$DependentJob.run(MultiKeyOrderingExecutor.java:408)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Inside ThreadLocal, the lookup is going through getEntryAfterMiss().

The definition of the getEntryAfterMiss() method in ThreadLocal is given as

/**

Why does the OrientDB read stack look like this after a few days?

Please suggest how to resolve this issue, and ask here if you need any clarification.

rbaarathi commented 5 years ago

Here is one more jstack dump showing the same issue:

"PlugInvokationTask-190801080232191867596-19080108023297868926-CircuitPlug" #2022 prio=5 os_prio=0 tid=0x00002af32001b000 nid=0xa395 runnable [0x00002b054d928000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
    at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
    at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
    at java.lang.ThreadLocal.get(ThreadLocal.java:163)
    at com.orientechnologies.orient.core.storage.cache.OCachePointer.getSharedBuffer(OCachePointer.java:220)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurablePage.deserializeFromDirectMemory(ODurablePage.java:161)
    at com.orientechnologies.orient.core.index.hashindex.local.OHashIndexBucket.getEntry(OHashIndexBucket.java:137)
    at com.orientechnologies.orient.core.index.hashindex.local.OHashIndexBucket.find(OHashIndexBucket.java:92)
    at com.orientechnologies.orient.core.index.hashindex.local.OLocalHashTable.get(OLocalHashTable.java:369)
    at com.orientechnologies.orient.core.index.engine.OHashTableIndexEngine.get(OHashTableIndexEngine.java:145)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doGetIndexValue(OAbstractPaginatedStorage.java:1764)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.getIndexValue(OAbstractPaginatedStorage.java:1753)
    at com.orientechnologies.orient.core.index.OIndexOneValue.get(OIndexOneValue.java:58)
    at com.orientechnologies.orient.core.index.OIndexOneValue.get(OIndexOneValue.java:40)
    at com.orientechnologies.orient.core.index.OIndexAbstractDelegate.get(OIndexAbstractDelegate.java:58)
    at com.orientechnologies.orient.core.index.OIndexTxAwareOneValue.get(OIndexTxAwareOneValue.java:262)
    at com.orientechnologies.orient.core.index.OIndexTxAwareOneValue.get(OIndexTxAwareOneValue.java:40)
    at in.co.nmsworks.documenttdb.db.DocDB.getDocument(DocDB.java:295)
    at in.co.nmsworks.documenttdb.db.DocDB.getDocTypeObject(DocDB.java:244)
    at in.co.nmsworks.documenttdb.db.DocDB.getDocTypeObject(DocDB.java:239)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.InvReader.getEntity(InvReader.java:159)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.InvReader.getEntity(InvReader.java:135)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.dataprovider.OrientDataProvider.getEntity(OrientDataProvider.java:83)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.SDHRcaInvReader.getEntitiesMap(SDHRcaInvReader.java:1087)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.SDHRcaInvReader.addPortionEntities(SDHRcaInvReader.java:320)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.SDHRcaInvReader.addEntities(SDHRcaInvReader.java:306)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.SDHRcaInvReader.buildImpactAnalysisDataHolder(SDHRcaInvReader.java:826)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.SDHRcaInvReader.buildImpactAnalysisDataHolder(SDHRcaInvReader.java:706)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.impl.circuit.impact.node.constructor.impl2.DefaultImpactAnalysisDataBuilder.buildImpactAnalysisData(DefaultImpactAnalysisDataBuilder.java:42)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.impl.circuit.impl2.CircuitPlugHelper.getImpactAnalysisDataHolder(CircuitPlugHelper.java:162)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.impl.circuit.impl2.CircuitPlug.getImpactAnalysisDataHolder(CircuitPlug.java:402)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.impl.circuit.impl2.CircuitPlug.analyze(CircuitPlug.java:276)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.Plug.compute(Plug.java:564)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.Plug.doComputeForSingleEvent(Plug.java:236)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.plug.Plug.doComputeForEvent(Plug.java:204)
    at in.co.nmsworks.cygnet.telecom.fault.rcasia.server.ComputePlugJob.doWork(ComputePlugJob.java:96)
    at com.nmsworks.cygnet.util.treejob.TreeJob.run(TreeJob.java:49)
    at com.nmsworks.cygnet.util.orderingexecutor.MultiKeyOrderingExecutor$DependentJob.run(MultiKeyOrderingExecutor.java:408)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

andrii0lomakin commented 5 years ago

Hi @rbaarathi, the thread-local hash map is polluted. Do you close the database instance after you use it? Do you use a connection pool? BTW, please use the latest 2.2.x; it contains some fixes that remove stale references from the thread-local map.
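
To illustrate the general idea of keeping a pooled thread's thread-local map clean, here is a minimal, JDK-only sketch. It is not OrientDB code: `CURRENT_DB` and `withBinding` are hypothetical names, and the `String` value is only a stand-in for whatever per-thread state a library binds. It assumes the work runs on long-lived executor threads, as in the stack traces above, and shows a binding being removed in `finally` so nothing is left behind on the worker thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class ThreadLocalCleanupSketch {
    // Hypothetical stand-in for a per-thread database binding; not an OrientDB class.
    public static final ThreadLocal<String> CURRENT_DB = new ThreadLocal<>();

    // Binds a value for the duration of the work, then removes the entry so the
    // long-lived worker thread does not keep it in its thread-local map.
    public static <T> T withBinding(String dbName, Supplier<T> work) {
        CURRENT_DB.set(dbName);
        try {
            return work.get();
        } finally {
            CURRENT_DB.remove();
        }
    }

    public static void main(String[] args) throws Exception {
        // A single worker thread, so both tasks run on the same thread, one after the other.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> boundRead = () -> withBinding("inventory", CURRENT_DB::get);
            Callable<String> laterRead = CURRENT_DB::get;
            System.out.println("during task: " + pool.submit(boundRead).get());
            System.out.println("after task:  " + pool.submit(laterRead).get());
        } finally {
            pool.shutdown();
        }
    }
}
```

The second task sees `null` because the first task removed its binding before returning the thread to the pool. Whether application-side cleanup like this is enough on 2.2.20, or whether the stale entries come from inside the library and need the fixes in a later 2.2.x release, is something this sketch cannot tell you.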

rbaarathi commented 5 years ago

Yes @laa, we create the database instance only inside an auto-closeable try block.

We will certainly move to a newer OrientDB 2.2.x release, but how can we solve this issue in the meantime? We cannot upgrade the OrientDB version in our production environment immediately.

Can you suggest a temporary workaround for this issue?

rbaarathi commented 5 years ago

Also, how is the database instance related to the thread-local map?

rbaarathi commented 5 years ago

@laa, please suggest.

rbaarathi commented 5 years ago

Can you please advise here?

tglman commented 4 years ago

Hi,

OrientDB 2.2.x has been out of support for a while, so I am closing this.

Regards