Congyuwang / RocksDict

Python fast on-disk dictionary / RocksDB & SpeeDB Python binding
https://congyuwang.github.io/RocksDict/rocksdict.html
MIT License

Exception: Corruption: VersionBuilder: Cannot delete table file #75038 from level 1 since it is not in the LSM tree #108

Closed: Menziess closed this issue 6 months ago

Menziess commented 7 months ago

This exception was raised, and I found this related RocksDB test:

TEST_F(VersionBuilderTest, ApplyFileDeletionNotInLSMTree) {
  UpdateVersionStorageInfo();

  EnvOptions env_options;
  constexpr TableCache* table_cache = nullptr;
  constexpr VersionSet* version_set = nullptr;

  VersionBuilder builder(env_options, &ioptions_, table_cache, &vstorage_,
                         version_set);

  VersionEdit edit;

  constexpr int level = 3;
  constexpr uint64_t file_number = 1234;

  edit.DeleteFile(level, file_number);

  const Status s = builder.Apply(&edit);
  ASSERT_TRUE(s.IsCorruption());
  ASSERT_TRUE(std::strstr(s.getState(),
                          "Cannot delete table file #1234 from level 3 since "
                          "it is not in the LSM tree"));
}
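The message asserted in this test has the same shape as the one in the issue title, so the file number and level can be pulled out of the exception text for logging. A minimal sketch, assuming the wording stays as reported here (it may differ between RocksDB versions); `parse_missing_table_file` is a hypothetical helper, not part of RocksDict:

```python
import re
from typing import Optional, Tuple

# Pattern built from the message quoted in this issue; the exact wording is an
# assumption and may change between RocksDB versions.
_MISSING_FILE_PATTERN = re.compile(
    r"Cannot delete table file #(\d+) from level (\d+) "
    r"since it is not in the LSM tree"
)


def parse_missing_table_file(message: str) -> Optional[Tuple[int, int]]:
    """Return (file_number, level) if the message matches, else None.

    Hypothetical helper for illustration only.
    """
    match = _MISSING_FILE_PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


reported = (
    "Corruption: VersionBuilder: Cannot delete table file #75038 "
    "from level 1 since it is not in the LSM tree"
)
print(parse_missing_table_file(reported))  # (75038, 1)
```

Calling this on `str(exc)` inside an `except Exception as exc:` block around the database open would let a service log which file and level were involved before deciding what to do next.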

Where should I look to find the root cause of the issue? Is it caused by my code, by RocksDict, or simply by RocksDB? It happened just once across all the services that have been running for a long time.

Any help would be appreciated.

Congyuwang commented 7 months ago

It looks like this is caused by an extra LSM file that is not in the tree structure. Perhaps you should try deleting the file numbered 75038 (back it up first).
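The backup step could be sketched as follows. This assumes RocksDB's usual naming of table files as the file number zero-padded to six digits plus `.sst` (so `075038.sst` here), and that the database is closed while files are touched. `back_up_table_file` and the paths are hypothetical. The sketch only copies the file; whether removing the original afterwards resolves the error is not confirmed in this thread.

```python
import shutil
from pathlib import Path
from typing import Optional


def back_up_table_file(
    db_dir: str, file_number: int, backup_dir: str
) -> Optional[Path]:
    """Copy <db_dir>/<file_number as 6 digits>.sst into backup_dir.

    Returns the path of the copy, or None if no such file exists.
    Hypothetical helper; the file naming scheme is an assumption.
    """
    source = Path(db_dir) / f"{file_number:06d}.sst"
    if not source.is_file():
        return None
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    # copy2 keeps timestamps, which can help when inspecting the file later.
    shutil.copy2(source, target)
    return target
```

For example, `back_up_table_file("/path/to/db", 75038, "/path/to/backup")` (placeholder paths) returns `None` when the file is not present, which would itself be a useful hint that the file is already gone from disk.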

Menziess commented 7 months ago

I haven't seen it again since it happened a while ago. Could a crash cause this error? For example: compaction is running, a file is deleted, a crash occurs, and the file has not yet been removed from the LSM tree.

Congyuwang commented 6 months ago

Yeah, I think likely.