westerndigitalcorporation / zenfs

ZenFS is a storage backend for RocksDB that enables support for ZNS SSDs and SMR HDDs.
GNU General Public License v2.0

Why diff is not 0 when zone and file share the same lifetime? #217

Closed Haltz closed 2 years ago

Haltz commented 2 years ago

https://github.com/westerndigitalcorporation/zenfs/blob/22f110fa8b7b71a2a3467502177ea70241402916/fs/zbd_zenfs.cc#L490

Hi, guys! Recently I have been reading the ZenFS source code. I get the core idea of how zones are allocated for SSTs, but what I can't follow is why diff is not 0 when the zone and the file share the same lifetime.

Hope for some teaching! Thanks in advance!

yhr commented 2 years ago

Hi @Haltz - the allocator is designed to prioritize filling up zones with data that has a shorter life span than that of the data already written in the zone, to ensure that the new data won't prolong the reclaim time. Only in cases where we don't have any choice (the active zone limit has been reached) do we pick zones with LIFETIME_DIFF_COULD_BE_WORSE.
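The rule described above can be sketched roughly as follows. This is a simplified, self-contained illustration and not the code at the linked line: the enum is a stand-in for RocksDB's `Env::WriteLifeTimeHint`, the two penalty values are assumed for illustration (only their relative order matters), and the handling of files without a hint is an assumption as well.

```cpp
#include <cassert>

// Stand-in for RocksDB's Env::WriteLifeTimeHint (assumption for this sketch).
// A larger value means a longer expected lifetime.
enum WriteLifeTimeHint {
  WLTH_NOT_SET = 0,
  WLTH_NONE,
  WLTH_SHORT,
  WLTH_MEDIUM,
  WLTH_LONG,
  WLTH_EXTREME
};

// Illustrative penalty values (assumed); only their ordering matters here.
constexpr unsigned int LIFETIME_DIFF_NOT_GOOD = 100;
constexpr unsigned int LIFETIME_DIFF_COULD_BE_WORSE = 50;

// Lower return value = better match between a file and an open zone.
unsigned int GetLifeTimeDiff(WriteLifeTimeHint zone_lifetime,
                             WriteLifeTimeHint file_lifetime) {
  assert(file_lifetime <= WLTH_EXTREME);

  // Assumed handling: a file without a usable hint only matches a zone that
  // carries the same (missing) hint.
  if (file_lifetime == WLTH_NOT_SET || file_lifetime == WLTH_NONE) {
    return file_lifetime == zone_lifetime ? 0 : LIFETIME_DIFF_NOT_GOOD;
  }

  // Preferred case: the file is expected to die before the data already in
  // the zone, so it cannot prolong the zone's reclaim time.
  if (zone_lifetime > file_lifetime) {
    return static_cast<unsigned int>(zone_lifetime - file_lifetime);
  }

  // Equal lifetimes are acceptable, but only as a fallback.
  if (zone_lifetime == file_lifetime) return LIFETIME_DIFF_COULD_BE_WORSE;

  // The file would outlive the zone's data.
  return LIFETIME_DIFF_NOT_GOOD;
}
```

Under these assumptions, a WLTH_SHORT file scores better against a WLTH_LONG zone than against a WLTH_SHORT zone, which is why the diff for equal lifetimes is not 0.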

Haltz commented 2 years ago

Thank you for explaining, @yhr. But I'm not sure I understand the benefit of this design, so I hope for some further explanation.

For SSTs with Env::WLTH_SHORT, the allocator always allocates an empty zone for them if possible, because the diff returned is LIFETIME_DIFF_COULD_BE_WORSE. If SSTs with Env::WLTH_SHORT consume all active zones, ZenFS would have to write SSTs with a different WLTH to one of the active zones. Is there a chance that SSTs with various lifetimes get mixed in a zone, making it hard to ensure that new data does not prolong the reclaim time?

yhr commented 2 years ago

@Haltz: ZenFS will co-locate WLTH_SHORT data if all active zones are used (due to the allocator returning LIFETIME_DIFF_COULD_BE_WORSE). If ZenFS can't find a decent lifetime match (NOT_GOOD), it will finish one of the active zones to achieve data separation.
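As a rough sketch of that decision, assuming the best diff over all open zones has already been computed: the names `AllocationAction` and `ChooseAction` are hypothetical, the penalty values are illustrative, and the real allocator does more (locking, token accounting, choosing which zone to finish).

```cpp
// Illustrative penalty values (assumed); only their ordering matters here.
constexpr unsigned int LIFETIME_DIFF_NOT_GOOD = 100;
constexpr unsigned int LIFETIME_DIFF_COULD_BE_WORSE = 50;

// Hypothetical summary of what the allocator does next.
enum class AllocationAction {
  kUseBestOpenZone,          // write into the best-matching open zone
  kOpenEmptyZone,            // an active-zone slot is free: start a new zone
  kFinishZoneThenOpenEmpty   // at the limit: finish a zone, then start a new one
};

// best_diff: lowest lifetime diff found among the currently open zones.
// active_zone_available: true if the active zone limit has not been reached.
AllocationAction ChooseAction(unsigned int best_diff,
                              bool active_zone_available) {
  // A zone holding longer-lived data exists: fill it up.
  if (best_diff < LIFETIME_DIFF_COULD_BE_WORSE) {
    return AllocationAction::kUseBestOpenZone;
  }
  // Only mediocre or bad matches exist, but a fresh zone can still be opened.
  if (active_zone_available) {
    return AllocationAction::kOpenEmptyZone;
  }
  // Limit reached and a same-lifetime zone exists: co-locate the data.
  if (best_diff == LIFETIME_DIFF_COULD_BE_WORSE) {
    return AllocationAction::kUseBestOpenZone;
  }
  // Limit reached and no decent match: finish a zone to keep data separated.
  return AllocationAction::kFinishZoneThenOpenEmpty;
}
```

In this sketch, files with different lifetimes end up in the same zone only when the file is shorter-lived than the zone's data, which is the case the allocator prefers.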

yhr commented 2 years ago

@Haltz : Did my previous answer make sense? Can we close this issue?

Haltz commented 2 years ago

Sorry, I will check my GitHub more frequently. I still have a question, though. What do you mean by "co-locate WLTH_SHORT data"? Do you mean transferring WLTH_SHORT SSTs from other zones to one zone at once (I did not find the related code, please correct me if I'm wrong)? Or opening an empty zone by finishing an active zone, so that ZenFS can put them together during GC?

yhr commented 2 years ago

@Haltz: If the active zone limit has been reached, data with the same lifetime (e.g. WLTH_SHORT) will be co-located. See https://github.com/westerndigitalcorporation/zenfs/blob/12b0f637877fb083491b37b53d0ecb87f77ce1b9/fs/zbd_zenfs.cc#L841

Haltz commented 2 years ago

So from what you say, can I assume that if the active zone limit has been reached, the data will pick the zone with the closest lifetime, because LIFETIME_NOT_GOOD can only be returned when the file lifetime is NONE or NOT_SET? In that case, SSTables from different levels will share the same zone, because they always have a lifetime hint (greater than NONE or NOT_SET). I don't think that gives good data separation.

yhr commented 2 years ago

See https://github.com/westerndigitalcorporation/zenfs/blob/12b0f637877fb083491b37b53d0ecb87f77ce1b9/fs/zbd_zenfs.cc#L490. LIFETIME_NOT_GOOD is also returned if lifetime(file) > lifetime(zone).

Haltz commented 2 years ago

Thank you for being patient! I think we can close the issue now.