openzfsonwindows / ZFSin

OpenZFS on Windows port
https://openzfsonwindows.org

Deleting a file on NTFS formatted dedup-enabled zvol does not reduce the dedup ratio #306

Open vrajendra-datacore opened 3 years ago

vrajendra-datacore commented 3 years ago

I created a zvol with dedup and compression enabled and formatted it with NTFS.

After copying a ~1 GB file to it three times:

C:\Users\Administrator\Desktop\zfs>zfs list
NAME       USED  AVAIL  REFER  MOUNTPOINT
tank      2.96G  27.5G   128K  /tank
tank/vol  2.95G  27.5G  2.95G  -

C:\Users\Administrator\Desktop\zfs>zpool list
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
tank  29.5G  1.00G  28.5G        -         -     0%     3%  3.00x  ONLINE  -

C:\Users\Administrator\Desktop\zfs>zdb.exe -DD tank
DDT-sha256-zap-duplicate: 16091 entries, size 444 on disk, 143 in core
DDT-sha256-zap-unique: 26 entries, size 274116 on disk, 88379 in core

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1       26   1.62M    288K    288K       26   1.62M    288K    288K
     2    15.7K   1006M   1004M   1004M    47.1K   2.95G   2.94G   2.94G
   128        1     64K      4K      4K      255   15.9M   1020K   1020K
 Total    15.7K   1007M   1004M   1004M    47.4K   2.96G   2.94G   2.94G

dedup = 3.00, compress = 1.01, copies = 1.00, dedup * compress / copies = 3.02
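
For reference, the summary line can be reproduced from the Totals row of the histogram. The sketch below assumes zdb derives dedup as referenced DSIZE over allocated DSIZE, compress as referenced LSIZE over referenced PSIZE, and copies as referenced DSIZE over referenced PSIZE; the variable names are illustrative and the inputs are the rounded figures printed above, so the results are approximate.

```python
# Totals row of the DDT histogram above, as printed (rounded) by zdb.
MIB = 1024 ** 2
GIB = 1024 ** 3

allocated_dsize = 1004 * MIB       # "allocated" DSIZE
referenced_lsize = 2.96 * GIB      # "referenced" LSIZE
referenced_psize = 2.94 * GIB      # "referenced" PSIZE
referenced_dsize = 2.94 * GIB      # "referenced" DSIZE

# Assumed formulas (see the lead-in); not copied from the zdb source.
dedup = referenced_dsize / allocated_dsize
compress = referenced_lsize / referenced_psize
copies = referenced_dsize / referenced_psize

print(f"dedup = {dedup:.2f}, compress = {compress:.2f}, copies = {copies:.2f}")
```

The histogram buckets are powers of two, so blocks referenced three times are counted in the refcnt 2 row, which is why 15.7K allocated blocks there account for 47.1K referenced blocks.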

After permanently deleting one of the files, the dedup ratio remains unchanged and the REFER space is not reclaimed.

C:\Users\Administrator\Desktop\zfs>zfs list
NAME       USED  AVAIL  REFER  MOUNTPOINT
tank      2.96G  27.5G   128K  /tank
tank/vol  2.95G  27.5G  2.95G  -

C:\Users\Administrator\Desktop\zfs>zpool list
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
tank  29.5G  1.00G  28.5G        -         -     0%     3%  3.00x  ONLINE  -

C:\Users\Administrator\Desktop\zfs>zdb.exe -DD tank
DDT-sha256-zap-duplicate: 16091 entries, size 444 on disk, 143 in core
DDT-sha256-zap-unique: 26 entries, size 274116 on disk, 88379 in core

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1       26   1.62M    288K    288K       26   1.62M    288K    288K
     2    15.7K   1006M   1004M   1004M    47.1K   2.95G   2.94G   2.94G
   128        1     64K      4K      4K      255   15.9M   1020K   1020K
 Total    15.7K   1007M   1004M   1004M    47.4K   2.96G   2.94G   2.94G

dedup = 3.00, compress = 1.01, copies = 1.00, dedup * compress / copies = 3.02

Is there a way to update the dedup ratio using UNMAP/TRIM?

imtiazdc commented 3 years ago

@lundman any insights on how to fix or investigate this? Really appreciate your help on this one.

adamdmoss commented 3 years ago

A zvol can't know about deletions on NTFS (at least until openzfs2/TRIM support). NTFS doesn't physically overwrite the data in the deleted blocks, so at the block level the data is still there and still dedup'd. Eventually NTFS will overwrite those blocks with new data, which is when they'll be un-dedup'd.
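
The behaviour described here can be illustrated with a toy model. This is a sketch only: `ToyDedupVolume` and its methods are hypothetical names, it counts whole blocks rather than bytes, and it ignores the small metadata writes a real filesystem issues on delete. It is not how ZFS implements its dedup table.

```python
import hashlib
from collections import Counter


class ToyDedupVolume:
    """Toy model of a dedup'd block device (hypothetical, not ZFS code)."""

    def __init__(self):
        self.blocks = {}            # LBA -> checksum of the data stored there
        self.refcount = Counter()   # checksum -> number of LBAs referencing it

    def write(self, lba, data):
        self.unmap(lba)             # overwriting drops the old reference first
        key = hashlib.sha256(data).hexdigest()
        self.blocks[lba] = key
        self.refcount[key] += 1

    def unmap(self, lba):
        key = self.blocks.pop(lba, None)
        if key is not None:
            self.refcount[key] -= 1
            if self.refcount[key] == 0:
                del self.refcount[key]

    def dedup_ratio(self):
        allocated = len(self.refcount)            # unique blocks stored
        referenced = sum(self.refcount.values())  # blocks the volume points at
        return referenced / allocated if allocated else 1.0


vol = ToyDedupVolume()
file_data = [bytes([i]) * 512 for i in range(4)]  # a 4-block "file"

# Three copies of the same file at different LBAs.
for copy in range(3):
    for i, block in enumerate(file_data):
        vol.write(copy * 4 + i, block)
ratio_after_copies = vol.dedup_ratio()    # 12 referenced / 4 allocated

# A filesystem-level delete only updates filesystem metadata; without
# UNMAP/TRIM no request for the file's LBAs reaches the volume.
ratio_after_delete = vol.dedup_ratio()

# Once the deleted file's LBAs are unmapped, their references are dropped.
for lba in range(8, 12):
    vol.unmap(lba)
ratio_after_unmap = vol.dedup_ratio()     # 8 referenced / 4 allocated

print(ratio_after_copies, ratio_after_delete, ratio_after_unmap)
```

In this model the ratio only changes when the volume itself sees an overwrite or an unmap for the affected LBAs, which matches the unchanged 3.00x in the report.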

adamdmoss commented 3 years ago

(Also FYI dedup is always a bad idea for performance, if that's a factor you care about.)

lundman commented 3 years ago

We have TRIM support, but I don't remember if I filled in the parts that call unmap in zvol.c for Windows yet. https://github.com/openzfsonwindows/ZFSin/search?q=zvol_unmap

adamdmoss commented 3 years ago

> We have TRIM support, but I don't remember if I filled in the parts that call unmap in zvol.c for Windows yet. https://github.com/openzfsonwindows/ZFSin/search?q=zvol_unmap

That's great news, but my Windows 10 install doesn't seem to believe that the zvol virtual device supports TRIM, at least when I last checked using instructions such as https://www.howtogeek.com/257196/how-to-check-if-trim-is-enabled-for-your-ssd-and-enable-it-if-it-isnt/ . Maybe I was using an ancient build; I will test again shortly.

lundman commented 3 years ago

Nah, I mean, I think we are missing the bit in storport that advertises that we do trim and then relays the discard request to zvol_unmap(). The pieces are there, they just aren't connected yet.
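
To make the "relay" step concrete: a SCSI UNMAP request carries a parameter list of LBA ranges that would have to be decoded into byte ranges before anything like zvol_unmap() could act on them. The sketch below follows the SBC-3 layout (8-byte header, then 16-byte block descriptors, all big-endian). The function names and the 512-byte block size are assumptions for illustration; this is not ZFSin code and does not assume any particular zvol_unmap() signature.

```python
import struct

BLOCK_SIZE = 512  # assumed logical block size; a real device reports its own


def build_unmap_param_list(extents):
    """Build an SBC-3 UNMAP parameter list from (lba, num_blocks) pairs."""
    # Each block descriptor: 8-byte LBA, 4-byte block count, 4 reserved bytes.
    descriptors = b"".join(
        struct.pack(">QI4x", lba, nblocks) for lba, nblocks in extents
    )
    # UNMAP DATA LENGTH counts the bytes that follow that 2-byte field.
    header = struct.pack(">HH4x", 6 + len(descriptors), len(descriptors))
    return header + descriptors


def parse_unmap_param_list(buf):
    """Return a list of (byte_offset, byte_length) ranges to discard."""
    if len(buf) < 8:
        raise ValueError("parameter list shorter than its 8-byte header")
    _data_len, desc_len = struct.unpack_from(">HH", buf, 0)
    desc_len = min(desc_len, len(buf) - 8)   # never read past the buffer
    ranges = []
    for off in range(8, 8 + desc_len - desc_len % 16, 16):
        lba, nblocks = struct.unpack_from(">QI", buf, off)
        if nblocks:                          # skip empty descriptors
            ranges.append((lba * BLOCK_SIZE, nblocks * BLOCK_SIZE))
    return ranges


payload = build_unmap_param_list([(2048, 8), (1 << 20, 256)])
for offset, length in parse_unmap_param_list(payload):
    print(f"discard offset={offset} length={length}")
```

Each resulting (offset, length) pair is what a handler would hand to the volume's discard routine, one call per range.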

vrajendra-datacore commented 3 years ago

@lundman I tried sending explicit SCSI commands with sg3_utils (Windows port). I didn't see scsiop_unmap being intercepted by ZFSin. Alternatively, I tried sending scsi_write_same16, which was intercepted (the command-line tool is sg3_write_same; it needs a starting LBA and a number of blocks). From there I eventually called zvol_unmap, but I did not see any reclamation at the zvol (checked with zfs list) after deleting a file.

lundman commented 3 years ago

Finally got around to looking deeper into this: https://git.io/JO3zU

It should hopefully take the incoming requests and pass them along to zvol_os_unmap.

I need to figure out what you did to test it though, or see if I can fool VMware into saying it has an SSD instead of an HDD.