Closed EricSchreiner closed 7 years ago
Hi @EricSchreiner what is your disk cache size?
Hi @laa these are the parameters we use. Everything else is default
Map<String, Object> defaultsMap = new HashMap<String, Object>();
defaultsMap.put("storage.keepOpen", false); // Tells the engine not to close the storage when a database is closed. Storages are closed when the process shuts down
defaultsMap.put("tx.useLog", true); // Transactions use a log file to store temporary data to be rolled back in case of a crash
defaultsMap.put("tx.log.synch", true); // Executes a sync against the file system for each log entry. This slows down transactions but guarantees transaction reliability on unreliable drives
defaultsMap.put("tx.commit.synch", true); // Synchronizes the storage after transaction commit (see Disable the disk synch)
defaultsMap.put("cache.level1.enabled", false);
defaultsMap.put("cache.level1.size", 0);
// ES removed Feb 2015, no longer needed since ODB 2.0.0: defaultsMap.put("cache.level2.enabled", false);
// ES removed Feb 2015, no longer needed since ODB 2.0.0: defaultsMap.put("cache.level2.size", 0);
defaultsMap.put("nonTX.recordUpdate.synch", true); // Executes a sync against the file system at every record operation. This slows down record updates but guarantees reliability on unreliable drives
defaultsMap.put("index.auto.rebuildAfterNotSoftClose", true); // Automatically rebuilds all automatic indexes on database open if the database was not closed properly
defaultsMap.put("mvrbtree.lazyUpdates", 1); // -1=Auto, 0=always lazy until explicit lazySave() is called by the application, 1=No lazy, commit at each change. >1=Commit at every X changes
OGlobalConfiguration.setConfiguration(defaultsMap);
@EricSchreiner I suppose it means that you have a 4GB disk cache, which is beyond the capabilities of a 32-bit JVM. Also, I strongly recommend against disabling the first-level cache.
About your settings:
defaultsMap.put("cache.level1.enabled", false);
defaultsMap.put("cache.level1.size", 0);
These may cause a lot of strange exceptions in your application.
defaultsMap.put("mvrbtree.lazyUpdates", 1);
mvrbtree was removed from the distribution a long time ago, so this parameter is not needed.
defaultsMap.put("nonTX.recordUpdate.synch", true);
defaultsMap.put("tx.commit.synch", true);
These are a legacy of the 1.x transaction implementation and are not used any more, so you can remove them too.
defaultsMap.put("tx.useLog", true);
This is always true and cannot be changed, even if you explicitly set it to false, so this parameter can be removed too.
defaultsMap.put("storage.keepOpen", false); // Tells the engine not to close the storage when a database is closed. Storages are closed when the process shuts down
This is not valid anymore; the parameter is always true and cannot be changed. The same goes for defaultsMap.put("tx.log.synch", true): it is always true and cannot be changed, so you can remove it from the map.
Back to your main issue. I suggest you set com.orientechnologies.orient.core.config.OGlobalConfiguration#DISK_CACHE_SIZE to 800 (meaning 800 MB) and keep -XX:MaxDirectMemorySize=1G. I also suggest you set the DISK_CACHE_SIZE parameter directly, not through the OGlobalConfiguration.setConfiguration(defaultsMap) call.
P.S. By the way, what is your expected DB size? Based on the tests you have already performed, do you have some expectations? By DB size I mean the size on disk, in GB or MB.
Hi @laa
Thanks for your reply. Does your answer mean that I should remove all settings except defaultsMap.put("index.auto.rebuildAfterNotSoftClose", true);?
You wrote that I should reduce DISK_CACHE_SIZE to 800 MB. Is this related to -XX:MaxDirectMemorySize? If yes, how? Can I set both DISK_CACHE_SIZE and -XX:MaxDirectMemorySize to 128 MB?
For your understanding: we have thousands of users running PicApport without -XX:MaxDirectMemorySize set. Many of them use a Raspberry Pi with just 1 GB of physical memory. In the past we recommended setting -Xmx512m for 32-bit installations and the Raspberry Pi, which works fine with several thousand photos (we tested with 6000). What I would like to achieve is that this still works with our new version based on OrientDB 2.2.xx, because I expect many of our users will not read our release notes. (We have also created an .exe file with a Windows installer for completely inexperienced users, whom I cannot ask to set any parameter.) And again, I do not care about speed in these low-memory situations; it should just work.
To answer your question: my test database contains about 50,000 photos (metadata and thumbnails). The total size of the database directory is 880 MB. Attached: dbconfig.txt
I also have a test system with one million photos (for this we have a 64-bit engine, but I have not tried it yet with 2.2.xx).
@EricSchreiner I see. I suppose I can help you run the database without -XX:MaxDirectMemorySize set, but we are busy right now. I will get back to this issue next Tuesday.
OK
Hi @EricSchreiner,
There are basically only three options that affect/limit the memory usage of OrientDB:
-Xmx limits the heap size, as we all know. Usually, if not provided, it is auto-configured by the JVM to some reasonable default. It can be configured only from the JVM args; OrientDB can't control it.
-XX:MaxDirectMemorySize limits the amount of off-heap "direct" memory the JVM may allocate. Usually, if not provided, it is auto-configured by the JVM to the value of -Xmx. It can be configured only from the JVM args; OrientDB can't control it.
-Dstorage.diskCache.bufferSize, aka OGlobalConfiguration#DISK_CACHE_SIZE, limits the disk cache size of OrientDB. It is auto-configured by OrientDB to the value of Xmx if XX:MaxDirectMemorySize is not provided; otherwise it is configured to max(machine_memory_size - Xmx - 2GB, 256MB) and upper-limited to the value of XX:MaxDirectMemorySize. The minimum supported value is 64MB. Note that this is not a hard limit: if the disk cache is full and none of its memory can be freed, the so-called small overflow buffers will be allocated. Setting the disk cache size to extremely low values while performing huge queries will not help, especially in the case of update/insert queries.
The disk cache allocates memory from the JVM's off-heap "direct" memory. So, to avoid OOMs, the inequality DISK_CACHE_SIZE <= XX:MaxDirectMemorySize must always hold, and Xmx + DISK_CACHE_SIZE + memory_reserved_by_os_and_other_processes <= machine_memory_size must also hold.
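The sizing rule and the two inequalities above can be written out as plain arithmetic. This is only a minimal sketch of the rule as described in this comment, not OrientDB's actual code: the class `DiskCacheBudget`, its method names, and the `osReserveMb` parameter are hypothetical, and all sizes are in MB.

```java
// Hypothetical helper, not part of OrientDB: mirrors the rule described above.
final class DiskCacheBudget {
    static final long MIN_CACHE_MB = 64; // minimum supported disk cache size

    /** Auto-configured disk cache size; maxDirectMb <= 0 means the flag was not provided. */
    static long autoDiskCacheMb(long machineMb, long xmxMb, long maxDirectMb) {
        if (maxDirectMb <= 0) {
            return xmxMb; // falls back to the value of Xmx
        }
        long candidate = Math.max(machineMb - xmxMb - 2048, 256);
        return Math.min(candidate, maxDirectMb); // upper-limited by the direct memory limit
    }

    /** Checks both inequalities for a given configuration. */
    static boolean fits(long machineMb, long xmxMb, long maxDirectMb,
                        long diskCacheMb, long osReserveMb) {
        // If the flag is absent, the JVM usually uses the value of Xmx as the direct limit.
        long effectiveDirectMb = maxDirectMb > 0 ? maxDirectMb : xmxMb;
        return diskCacheMb >= MIN_CACHE_MB
            && diskCacheMb <= effectiveDirectMb
            && xmxMb + diskCacheMb + osReserveMb <= machineMb;
    }

    public static void main(String[] args) {
        // 1 GB machine, -Xmx512m, no direct memory flag; 256 MB OS reserve is an assumed figure.
        long cache = autoDiskCacheMb(1024, 512, -1);
        System.out.println("disk cache = " + cache + " MB, fits = "
            + fits(1024, 512, -1, cache, 256));
    }
}
```

With these illustrative numbers the auto-configured cache equals the heap limit, so heap plus cache already consume the whole machine and the second inequality fails.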
Regarding your test box with less than 2GB of RAM mentioned in the emails: try setting Xmx to 512MB and removing all other options. -XX:MaxDirectMemorySize will be auto-configured to 512MB by the JVM, and DISK_CACHE_SIZE will be auto-configured to 512MB by OrientDB. The total memory consumption of OrientDB should then be around 1GB, which should leave enough RAM for the OS and other processes. Still, it is better to set -XX:MaxDirectMemorySize and DISK_CACHE_SIZE to explicit values according to the aforementioned inequalities.
In the case of a 1GB Raspberry Pi box with Xmx already configured to 512MB and neither -XX:MaxDirectMemorySize nor DISK_CACHE_SIZE set, OrientDB may eat up to 1GB of RAM. That is too much for the box. I may tune the DISK_CACHE_SIZE auto-configuration procedure to adjust for low-memory conditions, but there will still be a problem if Xmx is set so high that there is no RAM left for the disk cache. What is the typical Xmx of your Raspberry Pi users?
Hi @taburet, thanks for your answer. I'll check and come back to you. In the meantime: is it possible that max(machine_memory_size - Xmx - 2GB, 256MB) does not work if we execute OrientDB in a 32-bit VM on a computer that has more than 16 GB of RAM? What would machine_memory_size be in a 32-bit environment on a PC with 16 GB of RAM? Are you using: machine_memory_size = os.getTotalPhysicalMemorySize();
Is it possible that max(machine_memory_size - Xmx - 2GB, 256MB) does not work if we execute OrientDB in a 32-Bit VM on a Computer that has more than 16gig of RAM?
It should work, AFAIU, but in the wrong way :) Why would it behave differently specifically at 16GB?
What would machine_memory_size be in a 32-Bit Environment with a PC with 16 Gig of Ram?
It seems it will be 16GB, and that may be a problem. I will check this.
Are you using: machine_memory_size = os.getTotalPhysicalMemorySize();
Yes, exactly.
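To see what that call reports on a given machine, a small probe like the one below can be run under both a 32-bit and a 64-bit JVM. It is a sketch that assumes a HotSpot-style JVM exposing `com.sun.management.OperatingSystemMXBean` and the `sun.arch.data.model` system property; the class `MachineMemoryProbe` is hypothetical and unrelated to OrientDB's own detection code.

```java
import java.lang.management.ManagementFactory;

// Hypothetical probe: prints what the JVM reports about the machine and itself.
final class MachineMemoryProbe {
    /** A 32-bit process cannot address more than 2^32 bytes, whatever the machine has. */
    static final long ADDRESS_SPACE_32BIT = 1L << 32;

    /** Total physical memory in bytes, or -1 if this JVM does not expose it. */
    static long totalPhysicalBytes() {
        java.lang.management.OperatingSystemMXBean os =
            ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) os).getTotalPhysicalMemorySize();
        }
        return -1;
    }

    /** True when the JVM reports a 32-bit data model (property may be absent on some JVMs). */
    static boolean is32BitJvm() {
        return "32".equals(System.getProperty("sun.arch.data.model"));
    }

    public static void main(String[] args) {
        long physical = totalPhysicalBytes();
        System.out.println("physical memory (bytes): " + physical);
        System.out.println("32-bit JVM: " + is32BitJvm());
        System.out.println("max heap (bytes): " + Runtime.getRuntime().maxMemory());
        if (is32BitJvm() && physical > ADDRESS_SPACE_32BIT) {
            System.out.println("physical memory exceeds what this process can address");
        }
    }
}
```

The reported physical memory is a property of the machine, not of the process, which is why a formula based on it can overshoot in a 32-bit JVM on a large box.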
@EricSchreiner did you see messages like "32 bit JVM is detected. Lowering disk cache size from X to Y" in the logs?
Hi @taburet, no, we don't see messages like "32 bit JVM is detected". I've attached a log file that contains a configuration dump: PicApport-32Bit.txt
@taburet one more thing about why we use 32-bit on a machine with 16 GB of RAM: this is our test environment. The laptop I use for testing has 32 GB of RAM, and I also need to test installations I have received from users.
@EricSchreiner yes, I understand your needs. The strange thing is that, according to the provided log file, no auto-configuration is done on the OrientDB side at all, but it must be done, since the disk cache is not configured. I will investigate this further.
Hi @EricSchreiner could you try this build https://drive.google.com/file/d/0B2oZq2xVp841eklKTmVLMW1kMTQ/view?usp=sharing
Hi @EricSchreiner please do not set MaxDirectMemory but only heap size, so we will test whether your requirements are satisfied.
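One way to confirm which limits a run actually started with is to dump the JVM input arguments at startup, much like the `java.runtime.argument` lines in the PicApport log. The snippet below is a minimal sketch using only the standard `RuntimeMXBean`; the class `JvmArgCheck` and the method `findArg` are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

// Hypothetical startup check: reports whether the heap and direct memory limits were passed.
final class JvmArgCheck {
    /** Returns the first JVM argument starting with the given prefix, if any. */
    static Optional<String> findArg(List<String> jvmArgs, String prefix) {
        return jvmArgs.stream().filter(a -> a.startsWith(prefix)).findFirst();
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("heap limit:   "
            + findArg(jvmArgs, "-Xmx").orElse("(JVM default)"));
        System.out.println("direct limit: "
            + findArg(jvmArgs, "-XX:MaxDirectMemorySize").orElse("(not set)"));
    }
}
```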
Hi @laa, Hi @taburet
Still not working. I've removed the MaxDirectMemory Parameter. Please see logfile below
@EricSchreiner According to the stack trace you sent,
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
EXCEP@ at com.orientechnologies.common.directmemory.OByteBufferPool.allocateBuffer(OByteBufferPool.java:335)
the exception happens at line 335 of the byte buffer pool. But in the file as it stands after my changes, this line corresponds to the code at https://github.com/orientechnologies/orientdb/blob/fed5276ae99462665abe7b2ffed00cedd904a58b/core/src/main/java/com/orientechnologies/common/directmemory/OByteBufferPool.java#L335
if (clear) {
which obviously cannot cause an OOM in the byte buffer. That means you used an outdated version.
How did you get the 2.2.26 distribution? Did you download it from the link I posted?
Hi @laa I used the link you provided: https://drive.google.com/file/d/0B2oZq2xVp841eklKTmVLMW1kMTQ/view?usp=sharing as you can see in the logfile it should be the correct one....
DEBUG@ 11:11:23.231 PicApportDBService.setDbConfig: ----- start dump db-configuration ----- OrientDB 2.2.26-SNAPSHOT (build e48ae34ce1827858f78f9f4ddfe30fd289050478) configuration dump
Hi @EricSchreiner that is my fault then. Could you try this build: https://drive.google.com/file/d/0B2oZq2xVp841THlxbVhhemxMMGM/view?usp=sharing ? Could you also set the log level to INFO or lower, so that we see all information printed to the log by ODB?
Hi @laa still not working (see attached logfile). The build number is different from the previous one: OrientDB 2.2.26-SNAPSHOT (build 1083c79e63810dbafc9fee07f24654b22a5b7e65). I've also set -Dlog.console.level=INFO but it does not seem to work. PicApport-odb2.2.26.txt
@EricSchreiner could you add the following parameter to the command line: -Djava.util.logging.config.file=<path to file>? This file should be similar to https://github.com/orientechnologies/orientdb/blob/2.2.x/server/config/orientdb-server-log.properties. Then please send me the log output.
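For an embedded application where editing the command line is awkward, the same kind of java.util.logging configuration can be loaded programmatically. The sketch below uses only standard `java.util.logging` property keys; the values are illustrative and are not copied from the linked OrientDB file, and the class `JulInfoConfig` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.LogManager;
import java.util.logging.Logger;

// Hypothetical helper: applies an INFO-level console configuration to java.util.logging.
final class JulInfoConfig {
    // Same syntax as a file passed via -Djava.util.logging.config.file
    static final String PROPERTIES =
          ".level=INFO\n"
        + "handlers=java.util.logging.ConsoleHandler\n"
        + "java.util.logging.ConsoleHandler.level=INFO\n"
        + "java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter\n";

    /** Replaces the current logging configuration with PROPERTIES. */
    static void apply() {
        try {
            LogManager.getLogManager().readConfiguration(
                new ByteArrayInputStream(PROPERTIES.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        apply();
        Logger.getLogger("demo").info("logging configured at INFO level");
    }
}
```

Calling `apply()` before the database is opened would make INFO-level messages from any library that logs through java.util.logging visible on the console.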
Hi @laa Find attached the logfile created: orient-server.log.txt
@EricSchreiner could you try the new build https://drive.google.com/file/d/0B2oZq2xVp841THlxbVhhemxMMGM/view?usp=sharing and send me the log output as you did for the previous run?
Hi @laa see attached logfile orient-server.log.txt
The exceptions in there should already be fixed in 2.2.25?! Please see https://github.com/orientechnologies/orientdb/issues/7585#issuecomment-318350805
@EricSchreiner I used the latest version of the source code when I provided this build, which means, I suppose, that the exception is not fixed. But that issue was created about a different exception and my code does not touch that part; probably the exception started to reproduce once the OOM issue was fixed. Could you modify your test to run the operations without the Lucene index? Or just drop it for a while? It will make queries slower, but it will allow us to check that the OOM is gone.
Hi @laa sorry, but getting rid of Lucene is almost impossible. Anyway, I'm almost 100% sure that the issue with Lucene was fixed. (I've tested it myself.)
@EricSchreiner as I wrote, I made the distribution from the latest source code. @luigidellaquila @robfrank could you look at my commits and confirm that they do not affect Lucene functionality? It also means that we, unfortunately, have to reopen that issue.
At the moment I see that this issue is blocked by the Lucene exception; unfortunately, we cannot make progress here until it is fixed. Once we resolve that problem we can continue.
About the Lucene exception: can you try with the latest 2.2.26-SNAPSHOT?
@orientdb-builder I included the fix I provided a few minutes ago in the main branch. Could we re-run the build and get the latest snapshot from the current source code?
@laa Now the memory error is back again..... orient-server.log.txt
@EricSchreiner in the comment above I asked @orientdb-builder to run the build again to include the latest changes from the source code. Once the new snapshot is available, he or I will provide a link for you. The build you tried is the one I provided before my latest changes. I suppose a build from the latest source code will be provided today or tomorrow.
@EricSchreiner latest snapshot is generated https://oss.sonatype.org/content/repositories/snapshots/com/orientechnologies/orientdb-community/2.2.26-SNAPSHOT/orientdb-community-2.2.26-20170810.153209-13.tar.gz could you try it and send the log as usual :-)
HI @laa
see logfile.... orient-server.log.3.txt
Hi @EricSchreiner @robfrank as far as I can see, the Lucene exception was reproduced, the same as in issue #7585 which @EricSchreiner referenced. I will mark this issue as blocked until issue #7585 is resolved. Actually, I suppose that the OOM is fixed and that this is what allows the Lucene issue to reproduce, but we need to be 100% sure.
Hi @laa , @robfrank is it possible that I need another orientdb-spatial-2.2.23-dist.jar? If yes where will I get the orientdb-spatial-2.2.26-dist.jar?
@EricSchreiner that is very likely. Could you try this one: https://oss.sonatype.org/content/repositories/snapshots/com/orientechnologies/orientdb-spatial/2.2.26-SNAPSHOT/orientdb-spatial-2.2.26-20170810.160653-15.jar ?
@EricSchreiner the problem referenced in #7585 is solved as of 2.2.25. I assumed you had updated to the latest 2.2.25. So please take the latest snapshot of spatial as well.
@laa now the log with orientdb-spatial-2.2.26-20170810.160653-15.jar orient-server.log.2.txt
So the Lucene issue still persists. We will wait for the fix.
Hi @laa, hi @robfrank any news on this?
Hi @laa, hi @robfrank, I've tested 2.2.26 GA. looks much better :-) I'll continue testing tomorrow Logfile: orient-server.log.txt Config: PicApport-2.2.26.txt
@EricSchreiner ok so probably it was just an issue with a mix of libraries of different versions, I am waiting for your final conclusion.
Hi @EricSchreiner any update on this?
Hi @laa seems to be OK so far. I've attached two logfiles from the same database started with 32-bit and 64-bit. The only thing I see is that it sometimes takes a very long time to shut down the database. This seems to be new. orient-server-64bit.log.0.txt orient-server-32bit.log.0.txt
@EricSchreiner cool. What do you mean by "takes a very long time to shut down"? Does it happen on both instances or on 32-bit only?
@EricSchreiner I will close this issue because it seems to be fixed. But please open a new issue if you think something is wrong with the shutdown; maybe it is a bug, maybe not, let's see. If you are able to create a profiler snapshot, that would be cool; if not, we will provide instructions for a very good and free profiler, though without a handy GUI.
@santo-it for the release notes: "On 32-bit systems, because of the high level of memory fragmentation, ODB cannot allocate memory in big chunks, so it always allocates memory with page-size granularity. This decreases performance but avoids throwing an OOM when allocating direct memory."
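The idea in that release note, allocating direct memory one page at a time instead of reserving a large contiguous chunk, can be illustrated with a tiny pool. This is a sketch of the general technique only, not OrientDB's `OByteBufferPool`; the class `PageGranularPool` is hypothetical and the 64 KB page size used in `main` is just an illustrative value.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical pool: every direct buffer is exactly one page, so no large
// contiguous region of a fragmented 32-bit address space is ever requested.
final class PageGranularPool {
    private final int pageSize;
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private int allocated;

    PageGranularPool(int pageSize) {
        this.pageSize = pageSize;
    }

    /** Reuses a released page if one is available, otherwise allocates a single new page. */
    synchronized ByteBuffer acquire() {
        ByteBuffer page = free.poll();
        if (page == null) {
            page = ByteBuffer.allocateDirect(pageSize); // one small allocation per page
            allocated++;
        }
        page.clear(); // resets position and limit; contents are not zeroed
        return page;
    }

    /** Returns a page to the pool for later reuse. */
    synchronized void release(ByteBuffer page) {
        free.push(page);
    }

    /** Number of pages allocated from the JVM so far. */
    synchronized int allocatedPages() {
        return allocated;
    }

    public static void main(String[] args) {
        PageGranularPool pool = new PageGranularPool(64 * 1024);
        ByteBuffer page = pool.acquire();
        pool.release(page);
        pool.acquire(); // served from the free list, no new allocation
        System.out.println("pages allocated: " + pool.allocatedPages());
    }
}
```

The trade-off matches the note: many small allocations cost more than carving pages out of one big chunk, but each request is small enough to succeed in a fragmented address space.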
@laa thank you for your support......
OrientDB Version: 2.2.23
Java Version: 1.8.0_131 32Bit
OS: Windows 10
Hi @lvca ,
When I use your recommended settings with a 32-bit (1.8.0_131) runtime, I get the out-of-memory error immediately (with -XX:MaxDirectMemorySize=128m it just comes later).
Here are the relevant settings:
VER @ 10:42:50.358 java.runtime.version: 1.8.0_131-b11
VER @ 10:42:50.358 java.version: 1.8.0_131
VER @ 10:42:50.358 java.vm.version: 25.131-b11
VER @ 10:42:50.358 java.vm.vendor: Oracle Corporation
VER @ 10:42:50.358 java.vm.name: Java HotSpot(TM) Client VM
VER @ 10:42:50.358 java.specification.version: 1.8
VER @ 10:42:50.358 java.vm.specification.version: 1.8
VER @ 10:42:50.359 os.name: Windows 10
VER @ 10:42:50.359 os.version: 10.0
VER @ 10:42:50.359 os.arch: x86
MSG @ 10:42:50.359 java.runtime totalMemory=16mb maxMemory=1037mb freeMemory=11mb processors=8
MSG @ 10:42:50.361 java.runtime.argument: -Xmx1024m
MSG @ 10:42:50.361 java.runtime.argument: -XX:MaxDirectMemorySize=1G
MSG @ 10:42:50.361 java.runtime.argument: -Dpicapport.home=C:\ProgramData\Contecon
MSG @ 10:42:50.361 java.runtime.argument: -DTRACE=DEBUG
Here is the error during database start:
MSG @ 10:42:52.372 PicApportDBService.createDatabaseDirectory: C:\Users\Eric\.picapport\db
MSG @ 10:42:52.373 PicApportDBService.startDatabase:plocal:C:/Users/Eric/.picapport/db/db.2.2.23
EXCEP@ ============================================================
EXCEP@ Exception at: 2017-07-26 10:42:52
EXCEP@ Msg:
EXCEP@ null
EXCEP@ ------------------------------------------------------------
EXCEP@ java.lang.OutOfMemoryError
EXCEP@ at sun.misc.Unsafe.allocateMemory(Native Method)
EXCEP@ at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:127)
EXCEP@ at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
EXCEP@ at com.orientechnologies.common.directmemory.OByteBufferPool.allocateBuffer(OByteBufferPool.java:328)
EXCEP@ at com.orientechnologies.common.directmemory.OByteBufferPool.acquireDirect(OByteBufferPool.java:279)
EXCEP@ at com.orientechnologies.orient.core.storage.cache.local.OWOWCache.load(OWOWCache.java:769)
EXCEP@ at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.updateCache(O2QCache.java:1107)
EXCEP@ at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.doLoad(O2QCache.java:346)
EXCEP@ at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.allocateNewPage(O2QCache.java:397)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperation.commitChanges(OAtomicOperation.java:434)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.endAtomicOperation(OAtomicOperationsManager.java:468)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.endAtomicOperation(OAtomicOperationsManager.java:412)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurableComponent.endAtomicOperation(ODurableComponent.java:116)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.create(OPaginatedCluster.java:195)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.addClusterInternal(OAbstractPaginatedStorage.java:4136)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doAddCluster(OAbstractPaginatedStorage.java:4117)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.create(OAbstractPaginatedStorage.java:459)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.create(OLocalPaginatedStorage.java:127)
EXCEP@ at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.create(ODatabaseDocumentTx.java:438)
EXCEP@ at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.create(ODatabaseDocumentTx.java:398)
EXCEP@ at de.contecon.picapport.db.PicApportDBService.createDBSchema(Unknown Source)
EXCEP@ at de.contecon.picapport.db.PicApportDBService.startDatabase(Unknown Source)
EXCEP@ at de.contecon.picapport.db.PicApportDBService.startDatabase(Unknown Source)
EXCEP@ at de.contecon.picapport.PicApport.startDatabase(Unknown Source)
EXCEP@ at de.contecon.picapport.PicApport.init(Unknown Source)
EXCEP@ at de.contecon.picapport.PicApport.main(Unknown Source)
EXCEP@ at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
EXCEP@ at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
EXCEP@ at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
EXCEP@ at java.lang.reflect.Method.invoke(Method.java:498)
EXCEP@ at com.sun.javafx.application.LauncherImpl.launchApplicationWithArgs(LauncherImpl.java:389)
EXCEP@ at com.sun.javafx.application.LauncherImpl.launchApplication(LauncherImpl.java:328)
EXCEP@ at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
EXCEP@ at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
EXCEP@ at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
EXCEP@ at java.lang.reflect.Method.invoke(Method.java:498)
EXCEP@ at sun.launcher.LauncherHelper$FXHelper.main(LauncherHelper.java:767)