Okay, I think I know what's going on with the rocksdb memory now. The native RocksDB C++ objects are destroyed via the JNI interface in the Java finalize() method, which only runs when the wrapper object is garbage collected. However, the C++ object and the process memory it uses are invisible to the JVM (they don't even show up under -XX:NativeMemoryTracking), so while the JVM will eventually clean up the native C++ objects, it has no signal for when it needs to. If a lot of these objects are being created without generating much garbage on the Java heap itself, the JVM doesn't realize the process is rapidly consuming memory beyond its heap limit and that it should run the GC. So the JVM will happily keep running and allocating objects, thinking it's still below the memory limit, even when the process as a whole is not.
This pattern generates a few MB of JVM garbage but several GB of native memory allocations, and those native objects sit in memory for a while because the JVM doesn't think it needs to run GC (it will eventually happen, but maybe not for a while).
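To make the mechanism concrete, here's a self-contained sketch (all class names here are hypothetical stand-ins, not the real RocksJava API): a tiny Java wrapper "allocates" native memory that only a finalizer would release, so dropping references without an explicit cleanup call leaves the modeled native memory pinned until a GC plus finalization cycle happens to run.

```java
public class FinalizerLeakDemo {
    // Stand-in counter for C++ heap memory the JVM's heuristics can't see.
    static long nativeBytesLive = 0;

    static class NativeWrapper {
        private final long bytes;
        private boolean disposed = false;

        NativeWrapper(long bytes) {          // models the JNI-side allocation
            this.bytes = bytes;
            nativeBytesLive += bytes;
        }

        void dispose() {                     // models the native delete
            if (!disposed) { nativeBytesLive -= bytes; disposed = true; }
        }

        @Override
        protected void finalize() {          // only runs after GC notices the wrapper
            dispose();
        }
    }

    public static void main(String[] args) {
        // Each wrapper is a few dozen bytes of Java heap, so 1000 of them are
        // negligible garbage from the JVM's point of view -- but together they
        // pin ~1 GB of modeled native memory.
        for (int i = 0; i < 1000; i++) {
            NativeWrapper w = new NativeWrapper(1L << 20); // ~1 MB "native" allocation
            // reference dropped without dispose(): memory waits on GC + finalize
        }
        System.out.println(nativeBytesLive); // still ~1 GB; no GC pressure, so no finalizers
    }
}
```

The asymmetry is the whole problem: the quantity the GC watches (Java heap) stays tiny while the quantity that actually matters (native memory) grows without bound.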
Adding an explicit .close() runs the cleanup on the C++ object, destroying it immediately. The Java wrapper object remains and will be GC'ed later, but the C++ object is already freed, so this issue largely goes away. Technically the not-yet-collected wrapper objects still use a little more memory than the JVM accounts for, but if there's enough free memory, that isn't a problem.
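In recent RocksJava versions the native-backed classes (ReadOptions, RocksIterator, WriteBatch, etc.) implement AutoCloseable, so try-with-resources is the natural way to get the explicit close. A self-contained sketch of the fixed pattern (again with a hypothetical stand-in class, not the real API):

```java
public class ExplicitCloseDemo {
    // Stand-in counter for C++ heap memory the JVM's heuristics can't see.
    static long nativeBytesLive = 0;

    static class NativeWrapper implements AutoCloseable {
        private final long bytes;
        private boolean disposed = false;

        NativeWrapper(long bytes) {          // models the JNI-side allocation
            this.bytes = bytes;
            nativeBytesLive += bytes;
        }

        @Override
        public void close() {                // models RocksObject.close(): eager native delete
            if (!disposed) { nativeBytesLive -= bytes; disposed = true; }
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            // try-with-resources calls close() at the end of each iteration,
            // freeing the modeled native memory immediately instead of waiting
            // for GC + finalization.
            try (NativeWrapper w = new NativeWrapper(1L << 20)) {
                // ... use w ...
            }
        }
        System.out.println(nativeBytesLive); // prints 0: native side already freed
    }
}
```

The Java wrapper objects still become garbage and get collected whenever the GC gets around to it, but at that point they're tiny heap objects with no native memory attached.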
@theferrit32 definitely want a status on this. I think you've fixed it, but I didn't read the whole thread. In any case, let's make sure this is up to date and categorized properly.
This third-party copy of the RocksJava wiki explains the memory management model: https://github.com/EighteenZi/rocksdb_wiki/blob/master/RocksJava-Basics.md#memory-management