Open mzygQAQ opened 2 weeks ago
Status BlobStorage::Get(const ReadOptions& options, const BlobIndex& index,
                        BlobRecord* record, PinnableSlice* buffer) {
  auto sfile = FindFile(index.file_number).lock();
  if (!sfile)
    return Status::Corruption("Missing blob file: " +
                              std::to_string(index.file_number));
  // NOTE-1: the thread that purges obsolete files can delete the file at this
  // point, in which case the next line reports the error.
  return file_cache_->Get(options, sfile->file_number(), sfile->file_size(),
                          index.blob_handle, record, buffer);
}
I have checked the code here, and there is indeed a race condition.
@v01dstar Hello, can you help confirm?
At first glance, it seems possible. Let me dig into it further.
I think this is indeed a problem unless we set skip_value_in_compaction_filter to true, which we don't. I am surprised that we haven't seen this error in our users' environments. If I haven't missed anything, this is more than a race condition: since the compaction filter does not go through the normal read path (i.e. a read with a snapshot), this should happen quite frequently.
My guess is that in the TiDB environment, TiKV only uses the compaction filter in WriteCF, and WriteCF only stores transaction commit information and small values of less than 256 bytes. Moreover, WriteCF does not enable Titan by default, so the error does not occur there. The issue occurs when TiKV is used with RawKV, or when Titan is used directly.
Yes, I totally missed that. I guess you can leverage skip_value_in_compaction_filter in this case. Or you can propose a simple fix, as you suggested and as the TODO also mentions: return a corresponding error to the caller of Get(), and let the caller (the compaction filter) decide what to do.
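To make the second option concrete, here is a rough, self-contained sketch of the idea, not Titan's actual code. The Status, BlobFile, BlobStorage, and FilterBlob types below are simplified stand-ins invented for illustration, and keeping the entry when the blob file is gone is only one policy the caller might choose. The point is that Get() reports a purged file with its own error code, and the compaction-filter side inspects that code instead of treating it as corruption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for rocksdb::Status with only the codes needed here.
enum class Code { kOk, kCorruption, kNotFound };

struct Status {
  Code code;
  std::string msg;
  Status() : code(Code::kOk) {}
  Status(Code c, std::string m) : code(c), msg(std::move(m)) {}
  bool ok() const { return code == Code::kOk; }
};

// Hypothetical blob file: a file number, a "purged" flag that simulates the
// purge thread having deleted the file, and blob values keyed by offset.
struct BlobFile {
  uint64_t number = 0;
  bool purged = false;
  std::map<uint64_t, std::string> blobs;
};

// Simplified stand-in for the blob storage discussed in this issue.
class BlobStorage {
 public:
  void AddFile(std::shared_ptr<BlobFile> file) {
    const uint64_t number = file->number;
    files_[number] = std::move(file);
  }

  // Simulates the purge thread removing the file after it was looked up.
  void Purge(uint64_t number) {
    auto it = files_.find(number);
    if (it != files_.end()) it->second->purged = true;
  }

  // Returns a distinct kNotFound status when the file has been purged, so the
  // caller can tell it apart from real corruption.
  Status Get(uint64_t number, uint64_t offset, std::string* value) const {
    auto it = files_.find(number);
    if (it == files_.end())
      return Status(Code::kCorruption,
                    "Missing blob file: " + std::to_string(number));
    if (it->second->purged)
      return Status(Code::kNotFound,
                    "Blob file already purged: " + std::to_string(number));
    auto blob = it->second->blobs.find(offset);
    if (blob == it->second->blobs.end())
      return Status(Code::kCorruption, "Missing blob record");
    *value = blob->second;
    return Status();
  }

 private:
  std::map<uint64_t, std::shared_ptr<BlobFile>> files_;
};

enum class Decision { kKeep, kRemove };

// Hypothetical compaction-filter caller. Assumption: when the value cannot be
// read, the entry is kept and the status is handed back for the caller to log.
Decision FilterBlob(const BlobStorage& storage, uint64_t number,
                    uint64_t offset, Status* status) {
  assert(status != nullptr);
  std::string value;
  *status = storage.Get(number, offset, &value);
  if (!status->ok()) return Decision::kKeep;
  // Toy filter rule: drop values carrying an "expired:" prefix.
  return value.compare(0, 8, "expired:") == 0 ? Decision::kRemove
                                              : Decision::kKeep;
}
```

With this shape, a caller that gets Decision::kKeep together with a kNotFound status knows the blob file was purged underneath it, and can skip the value instead of failing the whole compaction.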
I'll give it a try.
This error is different from the existing, older "Missing blob" error.