Congyuwang / RocksDict

Python fast on-disk dictionary / RocksDB & SpeeDB Python binding
https://congyuwang.github.io/RocksDict/rocksdict.html
MIT License
179 stars, 8 forks

Multi thread/process #19

Closed: ryanfaircloth closed this issue 1 year ago

ryanfaircloth commented 2 years ago

I have a process writing entries to the dict, and multiple threads attempting to read values from it. However, if the client threads open the dict before the entries are created, we get a "key not found" error. How do I inform consumers that it is safe to read from disk?

Congyuwang commented 2 years ago

I assume that you have one instance of the dict opened (because you can't open RocksDB when another instance already holds the LOCK). So you are sharing the dict among Python threads (threads from the threading package).

In this case, as long as the_dict["a"] = "b" is actually executed by Python before the_dict["a"] is read, it should be fine. The underlying RocksDB guarantees that the key will be found, whether it is still in the in-memory cache or already on disk.
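The ordering requirement above can be enforced explicitly rather than left to timing. The following is only a minimal sketch under stated assumptions: a plain Python dict stands in for the single shared dict instance so the example runs without RocksDict installed, and the names `ready`, `writer`, `reader`, and `results` are hypothetical. A `threading.Event` tells the reader threads that the write has already happened.

```python
import threading

# Assumption: a plain dict stands in for the single shared Rdict instance,
# so this sketch runs without RocksDict installed.
the_dict = {}
ready = threading.Event()  # set once the key has been written
results = []


def writer():
    the_dict["a"] = "b"  # the write happens first ...
    ready.set()          # ... then readers are told it is safe to read


def reader():
    # Block until the writer signals; give up after 5 seconds.
    if ready.wait(timeout=5.0):
        results.append(the_dict["a"])


# Readers are started before the writer on purpose: they wait on the event
# instead of hitting a missing key.
threads = [threading.Thread(target=reader) for _ in range(3)]
threads.append(threading.Thread(target=writer))
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the event in place, it no longer matters whether the reader threads start before the writer; they only touch the key after the write has been executed.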

Congyuwang commented 2 years ago

Do reading and writing work correctly if you read and write your data in a single thread?

ryanfaircloth commented 2 years ago

For this use case there are two processes: one R/W and the other R/O.

Congyuwang commented 1 year ago

I think this cannot be done. RocksDB writes its data first to a MemTable, which is not on disk and is therefore queryable only by the R/W process. The R/O process has to wait until the MemTable is flushed (either automatically or manually), and there is no simple method for the R/W process to tell the R/O process that a key is ready immediately. In other words, there will always be a delay.
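Given that delay, one option on the R/O side is to retry the lookup for a bounded time instead of failing on the first miss. This is a rough sketch, not part of RocksDict: `wait_for_key` and its `lookup` parameter are hypothetical names, and `lookup` is assumed to be any callable that returns the value or raises `KeyError`. How that callable refreshes its view of the database (for example by reopening it) is deliberately left to the caller.

```python
import time


def wait_for_key(lookup, key, timeout=5.0, interval=0.1):
    """Call lookup(key) until it stops raising KeyError or the timeout elapses.

    `lookup` is a hypothetical callable supplied by the caller; it must
    return the value or raise KeyError when the key is not visible yet.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return lookup(key)
        except KeyError:
            # Key not visible yet: re-raise once the deadline has passed,
            # otherwise sleep briefly and try again.
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)
```

The timeout keeps a reader from waiting forever on a key that is never written; polling only bounds the delay, it does not remove it.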