westerndigitalcorporation / zenfs

ZenFS is a storage backend for RocksDB that enables support for ZNS SSDs and SMR HDDs.
GNU General Public License v2.0

file lifetime calculation in zenfs #251

Closed kekeMemory closed 1 year ago

kekeMemory commented 1 year ago

Hello, I am confused about the zone allocation part of ZenFS and would appreciate some detailed guidance. Thanks a lot!

Q1. When an SST is built, is the result returned by CalculateSSTWriteHint(0) the value that ends up in the Env::WriteLifeTimeHint file_lifetime parameter?

#\db\flush_job.cc
Status FlushJob::WriteLevel0Table() {
  AutoThreadOperationStageUpdater stage_updater(
      ThreadStatus::STAGE_FLUSH_WRITE_L0);
  db_mutex_->AssertHeld();
  const uint64_t start_micros = clock_->NowMicros();
  const uint64_t start_cpu_micros = clock_->CPUMicros();
  Status s;

  SequenceNumber smallest_seqno = mems_.front()->GetEarliestSequenceNumber();
  if (!db_impl_seqno_time_mapping_.Empty()) {
    // make a local copy, as the seqno_time_mapping from db_impl is not thread
    // safe, which will be used while not holding the db_mutex.
    seqno_to_time_mapping_ = db_impl_seqno_time_mapping_.Copy(smallest_seqno);
  }

  std::vector<BlobFileAddition> blob_file_additions;

  {
    auto write_hint = cfd_->CalculateSSTWriteHint(0);

...
#\fs\zbd_zenfs.cc
IOStatus ZonedBlockDevice::GetBestOpenZoneMatch(
    Env::WriteLifeTimeHint file_lifetime, unsigned int *best_diff_out,
    Zone **zone_out, uint32_t min_capacity) {

Q2. In the file_lifetime parameter above, does "file" mean the ZoneFile?

Q3. About the write hint value: is the hint for L0 equal to 0? Do the enum values 0-5 correspond to Level 0 through Level 5? If so, is the level count fixed at 5?

  enum WriteLifeTimeHint {
    WLTH_NOT_SET = 0,  // No hint information set
    WLTH_NONE,         // No hints about write life time
    WLTH_SHORT,        // Data written has a short life time
    WLTH_MEDIUM,       // Data written has a medium life time
    WLTH_LONG,         // Data written has a long life time
    WLTH_EXTREME,      // Data written has an extremely long life time
  };
attack204 commented 1 year ago

I think I can answer your questions.

Q1. Yes.
Q2. Yes.
Q3. The hint is a lifetime class, not a level number. Files in level 0 or 1 get Medium, level 2 gets Long, and higher levels (3 and above) get Extreme. The Write-Ahead Log and the Manifest get the Short hint.
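
To make the Q3 mapping concrete, here is a minimal sketch of the level-to-hint rule described above. It is not the RocksDB source: `SketchSstWriteHint` and its `base_level` parameter are hypothetical names for illustration, and the sketch assumes the first non-L0 level holding data is level 1. The enum is a local copy of the one quoted in the question, so the snippet compiles without RocksDB headers.

```cpp
#include <cassert>

// Local copy of the enum quoted in the question (values 0..5).
enum WriteLifeTimeHint {
  WLTH_NOT_SET = 0,  // No hint information set
  WLTH_NONE,         // No hints about write life time
  WLTH_SHORT,        // Data written has a short life time
  WLTH_MEDIUM,       // Data written has a medium life time
  WLTH_LONG,         // Data written has a long life time
  WLTH_EXTREME,      // Data written has an extremely long life time
};

// Hypothetical helper, not a RocksDB or ZenFS function.
// Maps an LSM level to a lifetime hint following the answer above.
// Assumption: base_level is the first non-L0 level that holds data (1 here).
WriteLifeTimeHint SketchSstWriteHint(int level, int base_level = 1) {
  assert(level >= 0 && base_level >= 1);
  if (level <= base_level) {
    return WLTH_MEDIUM;   // L0 and L1 when base_level is 1
  }
  if (level - base_level >= 2) {
    return WLTH_EXTREME;  // L3 and above when base_level is 1
  }
  return WLTH_LONG;       // L2 when base_level is 1
}
```

Under these assumptions, `SketchSstWriteHint(0)` yields `WLTH_MEDIUM` (numeric value 3), so the hint for L0 is not 0. Every level from 3 upward shares `WLTH_EXTREME`, so the six enum values do not limit the number of levels.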

kekeMemory commented 1 year ago

Thanks for your answer