Congyuwang / RocksDict

Python fast on-disk dictionary / RocksDB & SpeeDB Python binding
https://congyuwang.github.io/RocksDict/rocksdict.html
MIT License

Corrupt or unsupported format_version #33

Closed vhrtanek closed 1 year ago

vhrtanek commented 1 year ago

Hi there,

I keep receiving this error when trying to create or access any new or existing Rdict. Could you please suggest how to solve it?

Traceback (most recent call last):
  File "/home/main.py", line 53, in <module>
    db.ingest_external_file(["test.db/temp1.sst"])
Exception: Corruption: Corrupt or unsupported format_version: 33554440

This is the result of db.repair(path):

Exception: IO error: lock hold by current process, acquire time 1669990343 acquiring thread 139711774802560: tempDict1/LOCK: No locks available

I was running tests with approx. 100M records and some processes crashed. Could that have caused the lock and the corruption? Thanks for your help.
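
As a quick sanity check on the number in the error message, the sketch below prints the reported format_version in hex and as raw bytes. It rests on an assumption not stated in this thread: that valid RocksDB format_version values are small single-digit integers. Under that assumption, a value like this looks like unrelated bytes read from an incomplete file rather than a real version number.

```python
# The value reported in the error message above.
reported = 33554440

# Hex and byte views make it easier to see that this is not a small integer.
as_hex = hex(reported)
as_little_endian = reported.to_bytes(4, "little")

print(as_hex)
print(as_little_endian)

# Assumption (not from the thread): real format_version values are single digits.
looks_like_valid_version = 0 <= reported < 10
print(looks_like_valid_version)
```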

Congyuwang commented 1 year ago

To fix the lock problem encountered when calling db.repair, try calling db.close() first.

vhrtanek commented 1 year ago

It helped with db.repair, but the original error still occurs. Do you have any advice on that?

This is the code I run:

import json
from rocksdict import Rdict, Options, SstFileWriter
import random

rdictName = "tempDict1"

value = {
    "bkey": "0000000000000000",
    "hkey": "00000000000000000000000000000000",
    "timestamp": "0000000000000000"
}

byteValue = json.dumps(value).encode('utf-8')

key_rand_bytes1 = [random.randbytes(16) for _ in range(1000000)]
key_rand_bytes1.sort()

writer = SstFileWriter(Options())
writer.open("test.db/temp1.sst")

for k in key_rand_bytes1:
    writer[k] = byteValue

db = Rdict(rdictName)
db.ingest_external_file(["test.db/temp1.sst"])

db.close()
db.repair(rdictName)
Rdict.destroy(rdictName)

Congyuwang commented 1 year ago

Hi, you have to call writer.finish() once you are done writing to the writer and before ingesting the file. With that change, you won't need repair.

import json
from rocksdict import Rdict, Options, SstFileWriter
import random

rdictName = "tempDict1"

value = {
    "bkey": "0000000000000000",
    "hkey": "00000000000000000000000000000000",
    "timestamp": "0000000000000000"
}

byteValue = json.dumps(value).encode('utf-8')

key_rand_bytes1 = [random.randbytes(16) for _ in range(1000000)]
key_rand_bytes1.sort()

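# Write the sorted keys into a standalone SST file.
writer = SstFileWriter(Options())
writer.open("test.db/temp1.sst")

for k in key_rand_bytes1:
    writer[k] = byteValue

# Finalize the SST file; this must happen before the file is ingested.
writer.finish()

# Ingest the finished file into the database.
db = Rdict(rdictName)
db.ingest_external_file(["test.db/temp1.sst"])

db.close()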

Congyuwang commented 1 year ago

See example: https://github.com/Congyuwang/RocksDict/blob/5bec3c24df3bb90a771f97841d6b3ead83f7de5d/examples/sst_file_write.py#L27
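
Since the root cause in this thread was a forgotten writer.finish(), one way to make that step hard to skip is a small context manager. This is only a sketch and not part of rocksdict: finished_sst is a hypothetical helper name, and it assumes nothing about the writer beyond the open(path) and finish() methods used in the code above. It calls finish() only when the block exits without an exception, so a half-written file is never finalized as if it were complete.

```python
from contextlib import contextmanager


@contextmanager
def finished_sst(writer, path):
    """Open `writer` at `path`; call finish() only if the block exits cleanly.

    `writer` is assumed to expose open(path) and finish(), as the
    SstFileWriter in this thread does. The helper itself is hypothetical.
    """
    writer.open(path)
    # If the body of the `with` block raises, the exception surfaces here
    # and finish() below is skipped.
    yield writer
    writer.finish()


# Usage with rocksdict (left as comments so the sketch runs without it installed):
#
# from rocksdict import Rdict, Options, SstFileWriter
#
# with finished_sst(SstFileWriter(Options()), "test.db/temp1.sst") as writer:
#     for k in key_rand_bytes1:      # keys already sorted, as in the thread
#         writer[k] = byteValue
#
# db = Rdict("tempDict1")
# db.ingest_external_file(["test.db/temp1.sst"])
# db.close()
```

The design choice here is to skip finish() on errors rather than use try/finally, so that a failed write does not leave a finalized file that looks ready to ingest.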