Describe the problem you're observing
My ZFS pool contains an SSD special device, and I use recordsize=1M. I created a new dataset with special_small_blocks=1M set, so any file I put into this dataset is placed entirely on the SSD. Indeed, looking at zpool list -v, I can confirm that files put into this dataset are located only on the SSD. This is good if one wants particularly fast access to certain files.
Now I created a ZFS volume, also inside this dataset, with a 16K blocksize. I expected that, when I mount the volume and write to it, the data would also end up only on the special device, since the 16K blocks are much smaller than the record size. However, this is not the case: the data in the zvol is handled like any other data, i.e. it is distributed among the hard disks, and I only see some metadata being put onto the SSD.
I thought that setting special_small_blocks=<recordsize> was an elegant way of dividing a pool into "fast" datasets that live entirely on the special device and "normal" datasets with just their metadata held on the special device. While this is indeed true for ordinary files, it does not work for zvols. Why is this?
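For concreteness, the setup described above can be expressed as property settings along these lines (a sketch; the dataset name "tank/fast" and the volume size are hypothetical choices of mine, only the 1M and 16K figures come from the description):

# zfs set recordsize=1M tank
# zfs create -o special_small_blocks=1M tank/fast          # "fast" is a hypothetical name
# zfs create -V 100G -o volblocksize=16K tank/fast/vol     # name and size are hypothetical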
Describe how to reproduce the problem
I created a pool with the following config (ashift=12 for all devices, compression=zstd, special_small_blocks=0K):
# zpool status
  pool: tank
 state: ONLINE
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            sdg       ONLINE       0     0     0
            sdh       ONLINE       0     0     0
        special
          mirror-1    ONLINE       0     0     0
            sdc       ONLINE       0     0     0
            sdd       ONLINE       0     0     0

errors: No known data errors

# zfs get recordsize tank
NAME  PROPERTY    VALUE    SOURCE
tank  recordsize  128K     default
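For completeness, a pool with this layout could have been created with a command along these lines (a reconstruction under the stated settings, not necessarily the exact command used):

# zpool create -o ashift=12 -O compression=zstd -O special_small_blocks=0 tank mirror sdg sdh special mirror sdc sdd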
Next, I created two datasets. The "ssdonly" dataset has special_small_blocks set equal to the recordsize, so I expect everything put into this dataset to end up only on the special device. The "hdd" dataset, on the other hand, has special_small_blocks=0K, so only the metadata ends up on the special device. To see how data is distributed among the individual vdevs, I also add the output of zpool list -v below:
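For reference, the two datasets could have been created with commands along these lines (my reconstruction, assuming the default 128K recordsize shown above):

# zfs create -o special_small_blocks=128K tank/ssdonly
# zfs create -o special_small_blocks=0 tank/hdd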
Good. Now I create a bunch of random test files and put them into the "ssdonly" dataset. We can clearly see that ALL data is placed on the special device and none on the regular devices, as expected, since the "ssdonly" dataset is configured to do so:
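This test step can be sketched as follows (file size and count are arbitrary choices of mine; incompressible data from /dev/urandom is used so that the allocated sizes in zpool list -v are easy to compare against the amount written):

# for i in 1 2 3 4; do dd if=/dev/urandom of=/tank/ssdonly/test$i bs=1M count=256; done
# sync
# zpool list -v tank

Running the same dd commands against /tank/hdd reproduces the second test below.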
The same tests are repeated with the "hdd" dataset, and it can be verified that the "hdd" data is being stored on the regular vdevs.
Now I create a zvol inside the ssdonly dataset and, repeating the above tests, expect the entire data of the zvol to be held only on the special device. As can be verified, this is not the case.
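A sketch of this last step (the volume size, the ext4 filesystem, and the mountpoint are my choices; only the placement inside ssdonly and the 16K volblocksize are given above):

# zfs create -V 10G -o volblocksize=16K tank/ssdonly/vol   # size is arbitrary
# mkfs.ext4 /dev/zvol/tank/ssdonly/vol                     # any filesystem would do
# mount /dev/zvol/tank/ssdonly/vol /mnt
# dd if=/dev/urandom of=/mnt/test bs=1M count=512
# sync
# zpool list -v tank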
I wonder if this is a bug or a feature.