twmht / python-rocksdb

Python bindings for RocksDB
BSD 3-Clause "New" or "Revised" License

Iterator memory leak #23

Open gcp opened 6 years ago

gcp commented 6 years ago
self.db = rocksdb.DB("test.db", rocksdb.Options(max_open_files=100))
it = self.db.iteritems()
it.seek_to_first()
while True:
    next(it)  # memory grows steadily while scanning a large DB

This will eventually use up all memory (for a large DB).

gcp commented 6 years ago

Adding fill_cache=False does not change the outcome.
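For reference, the option was passed per iterator, roughly like this (a minimal sketch assuming the iterator factories accept the usual read options as keyword arguments):

it = self.db.iteritems(fill_cache=False)  # read options are per-iterator keywords
it.seek_to_first()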

twmht commented 6 years ago

@gcp

How large is the DB you used?

gcp commented 6 years ago

I guess it's about 56GB of data.

wavenator commented 6 years ago

Happens to me too, with the latest rocksdb version from GitHub.

patkivikram commented 4 years ago

Is anyone working on fixing this?

iFA88 commented 4 years ago

You may have some misconfiguration. I cannot reproduce the issue with RocksDB 6.6.4. first.py:

import rocksdb
from time import time
from hashlib import sha512
st = time()
opt = rocksdb.Options(create_if_missing=True)
db = rocksdb.DB("/tmp/rocktest", opt)
wdb = rocksdb.WriteBatch()
for i in range(10000000):
    wdb.put(sha512(i.to_bytes(4, "big")).digest(), sha512(i.to_bytes(8, "big")).digest())
    if i%1000000==0:
        db.write(wdb)
        wdb = rocksdb.WriteBatch()
db.write(wdb)
wdb = rocksdb.WriteBatch()
print(time()-st)
st = time()
db.compact_range()
print(time()-st)
> first.py
172.28729605674744
93.50748825073242

second.py:

import rocksdb
from time import time
st = time()
opt = rocksdb.Options(create_if_missing=True)
db = rocksdb.DB("/tmp/rocktest", opt)
it = db.iterkeys()
it.seek_to_first()
for i in it:
    pass
print(time()-st)
st = time()
> second.py
16.713879823684692

On the first run the maximum was 375 MB RES. On the second run the maximum was 50 kB RES.

The DB is 1.3 GB.

patkivikram commented 4 years ago

Do you need to close iterators to release memory?

patkivikram commented 4 years ago

I see this note here

Iterators should be closed explicitly to release resources: Store iterators (e.g., KeyValueIterator and WindowStoreIterator) must be closed explicitly upon completeness to release resources such as open file handlers and in-memory read buffers, or use try-with-resources statement (available since JDK7) for this Closeable class.

Otherwise, stream application’s memory usage keeps increasing when running until it hits an OOM.

This is from the JNI version of this library: https://docs.confluent.io/current/streams/developer-guide/memory-mgmt.html

iFA88 commented 4 years ago

Do you need to close iterators to release memory?

Python will do that, but memory usage did not increase during the iteration loop. Please run my script in your environment and check what you get.
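Something like this is the pattern I mean (a minimal sketch; the wrapped C++ iterator should be destroyed once Python drops the last reference to it):

import rocksdb

db = rocksdb.DB("/tmp/rocktest", rocksdb.Options(create_if_missing=True))
it = db.iterkeys()
it.seek_to_first()
for _ in it:
    pass
del it  # drop the last reference so the wrapper can free the underlying RocksDB iterator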

patkivikram commented 4 years ago

Okay, it looks like it could be one of the memtable configuration parameters. Is https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager implemented in the library?

iFA88 commented 4 years ago

Check my repo: https://github.com/iFA88/python-rocksdb. I have implemented some extra options.

iFA88 commented 4 years ago

@patkivikram Can you please upload your OPTIONS-XXXXX file?

patkivikram commented 4 years ago

Is this going on master with a new release of python-rocksdb?

iFA88 commented 4 years ago

AFAIK @twmht is refactoring the whole project, so I don't think this will be pulled. But even without my repo you can set important options such as:

rocksdb.Options(
    table_factory=rocksdb.BlockBasedTableFactory(
        block_size=16384, block_cache=rocksdb.LRUCache(8 << 20, 10)
    ),
    write_buffer_size=16 << 20,
    max_write_buffer_number=1,
    target_file_size_base=64 << 20,
)

It is important that you give good numbers to these options. If you have more column families, then you need to configure each of them. I myself work with 40-100 column families.

iFA88 commented 4 years ago

Maybe you did not release the iterator; if you use many iterators and write into the database from time to time, it can cause big memory usage. Please install my branch: https://github.com/iFA88/python-rocksdb/tree/iterDebug. When an iterator is created it will print "Iterator created", and when it is released it will print "Iterator released" to stdout. Try running your database from a terminal and check how many are created and released.

patkivikram commented 4 years ago

Your repo still does not have an option to set BlockBasedTableOptions.cache_index_and_filter_blocks, which is necessary to limit the memory impact.

iFA88 commented 4 years ago

Try it now
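A minimal sketch of how it would be used (this assumes my fork now exposes cache_index_and_filter_blocks as a BlockBasedTableFactory keyword; the 128 MB cache size is just an example):

import rocksdb

opts = rocksdb.Options(
    create_if_missing=True,
    table_factory=rocksdb.BlockBasedTableFactory(
        block_cache=rocksdb.LRUCache(128 << 20),  # shared block cache, 128 MB
        cache_index_and_filter_blocks=True,       # charge index/filter blocks to the cache
    ),
)
db = rocksdb.DB("/tmp/rocktest", opts)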

patkivikram commented 4 years ago

Is there a version on PyPI for your branch?

iFA88 commented 4 years ago

Nope, just git clone it, export the folders, and run python setup.py install.

patkivikram commented 4 years ago

When is a new release with this code expected? Can you merge it to master and do a release?

iFA88 commented 4 years ago

https://github.com/iFA88/python-rocksdb/releases

patkivikram commented 4 years ago

python-rocksdb is at version 0.7. What am I missing? https://pypi.org/project/python-rocksdb/

iFA88 commented 4 years ago

Forget PyPI, build from source.

patkivikram commented 4 years ago

@iFA88 Is this supported? https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters

patkivikram commented 4 years ago

We would need this to prevent unbounded off-heap allocation.

iFA88 commented 4 years ago

No, it's not supported, but I don't think you need it. I use many CFs in my own database, and these days I have read all keys from a CF (with iteration) without any memory issue. The CF that I dumped had 786M keys and was 145 GB. The memory peak was 2 GB.

patkivikram commented 4 years ago

What are your options for the DB? What's your environment and version? Have you looked at https://github.com/facebook/rocksdb/issues/4112

patkivikram commented 3 years ago

Hey @iFA88, are you able to get the read stats histogram in your log files? It is missing from my log files.

iFA88 commented 3 years ago

Sorry for the late answer. I just saw the rocksdb#4112 issue, very interesting. I have only these stats in the log (the end is cut off):

** Compaction Stats [table:transactions:index:byIsCoinbase:data] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) C
--------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0     23.1      0.03              0.03
  L1      1/0    2.83 MB   0.7      0.0     0.0      0.0       0.0      0.0       0.0   2.0     25.8     25.8      0.11              0.10
  L4      7/0   438.92 MB   0.1      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00
 Sum      8/0   441.75 MB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   4.6     19.7     25.2      0.14              0.14
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00

** Compaction Stats [table:transactions:index:byIsCoinbase:data] **
Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec
--------------------------------------------------------------------------------------------------------------------------------------------
 Low      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0     25.8     25.8      0.11              0.10
High      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0     23.8      0.03              0.03
User      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      2.2      0.00              0.00
Uptime(secs): 498008.6 total, 600.0 interval
Flush(GB): cumulative 0.001, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.1 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pendin

** File Read Latency Histogram By Level [table:transactions:index:byIsCoinbase:data] **

patkivikram commented 3 years ago

So even for you the read stats histogram is empty? How can we enable it to get that data along with the stats?

iFA88 commented 3 years ago

I think db.get_property(b"rocksdb.stats"), or for a CF db.get_property(b"rocksdb.stats", column_family=CFHANDLER), should work.
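A short usage sketch (assuming an open db handle; CFHANDLER stands in for a real column-family handle, and get_property returns bytes, or None for an unknown property):

stats = db.get_property(b"rocksdb.stats")
if stats:
    print(stats.decode())

# per-CF variant, with CFHANDLER as a placeholder for a real column-family handle
# print(db.get_property(b"rocksdb.stats", column_family=CFHANDLER).decode())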

patkivikram commented 3 years ago

It does not have the histogram, as it is the same one that's printed in the log files.

iFA88 commented 3 years ago

What did you get?

patkivikram commented 3 years ago

Empty File Read Latency Histogram By Level [default] **

iFA88 commented 3 years ago

Okay, then I think some debug statistics options are not set, and those are not supported by python-rocksdb.

patkivikram commented 3 years ago

what are they? https://github.com/facebook/rocksdb/wiki/Statistics Can you think of a quick change for this?

patkivikram commented 3 years ago

I see https://github.com/twmht/python-rocksdb/blob/master/rocksdb/options.pxd#L71

iFA88 commented 3 years ago

Yeah.. PRs are welcome :)

patkivikram commented 3 years ago

how do you log to stdout?

patkivikram commented 3 years ago

Also, do you rebuild this every time the rocksdb version changes?

iFA88 commented 3 years ago

I use my own python-rocksdb repo, and yes, when I change something I need to rebuild it. stdout: print()?

patkivikram commented 3 years ago

I meant how do you set db_log_dir to write to stdout?

patkivikram commented 3 years ago

@iFA88 do you support opening rocksdb as secondary? Do you want to merge this project with faust-streaming? We have a huge community using python-rocksdb but are not able to get good support from the Python bindings.

iFA88 commented 3 years ago

You can open as many instances as you want without any problem. I have forked this repo for my own purposes, but you can use it freely.

patkivikram commented 3 years ago

@iFA88 is there a close method to clean up the DB?

patkivikram commented 3 years ago

@iFA88 I meant does your repo support this feature - https://github.com/twmht/python-rocksdb/issues/50

iFA88 commented 3 years ago

@iFA88 I meant does your repo support this feature - #50

Nope, not supported. But you can open any other rocksdb instance at the same time.
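For what it's worth, a second handle can be opened read-only (a minimal sketch; this is not a secondary instance in the sense of #50, just the read_only flag of rocksdb.DB):

import rocksdb

# Read-only handle on the same files; not a catch-up-capable secondary instance
ro_db = rocksdb.DB("/tmp/rocktest", rocksdb.Options(create_if_missing=False), read_only=True)
print(ro_db.get(b"some-key"))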

asad-awadia commented 2 years ago

@gcp did you figure this out? I am seeing the same thing on iterations: even though I am closing the iterator, memory keeps growing.

gcp commented 2 years ago

I saw no point in continuing to use rocksdb with such basic bugs that weren't even acknowledged.

asad-awadia commented 2 years ago

@gcp what did you use instead?