axboe / fio

Flexible I/O Tester
GNU General Public License v2.0
5.01k stars 1.23k forks source link

fdp: support scheme placement id selection #1757

Closed parkvibes closed 1 month ago

parkvibes commented 1 month ago

fdp: support scheme placement id (index) selection

Add a new placement id selection method called scheme. It allows users to assign a placement ID (index) depending on the offset range. The strategy of the scheme is specified in the file by user and is applicable using the option dp_scheme.

Signed-off-by: Hyunwoo Park dshw.park@samsung.com

t/nvmept_fdp: add tests(302,303,400,401) for fdp scheme

Signed-off-by: Hyunwoo Park dshw.park@samsung.com

vincentkfu commented 1 month ago

Hyunwoo, although Ankit and I have shared feedback on your patch, it's not appropriate to include our Signed-off-by tags on your scheme patch since we are not co-authors.

Please update the commit message to remove the tags for Ankit and me.

For patches accepted via the mailing list maintainers add a Signed-off-by after accepting the patch but those should not be added by the patch authors.

For more details see https://www.kernel.org/doc/html/latest/process/submitting-patches.html?highlight=signed%20off.

parkvibes commented 1 month ago

Vincent, I updated messages for commit as well as PR in accordance with your guide.

vincentkfu commented 1 month ago

When I run the test against a QEMU FDP device I see:

root@localhost:~/fio-dev/1757/fio# python3 t/nvmept_fdp.py -f ./fio --dut /dev/ng0n1 --debug -o 400 401
Artifact directory is nvmept-fdp-test-20240513-154330
fio path is /root/fio-dev/1757/fio/fio
Test 1 SKIPPED (User request or override)
Test 2 SKIPPED (User request or override)
Test 3 SKIPPED (User request or override)
Test 4 SKIPPED (User request or override)
Test 5 SKIPPED (User request or override)
Test 6 SKIPPED (User request or override)
Test 7 SKIPPED (User request or override)
Test 8 SKIPPED (User request or override)
Test 9 SKIPPED (User request or override)
Test 10 SKIPPED (User request or override)
Test 11 SKIPPED (User request or override)
Test 12 SKIPPED (User request or override)
Test 13 SKIPPED (User request or override)
Test 14 SKIPPED (User request or override)
Test 100 SKIPPED (User request or override)
Test 101 SKIPPED (User request or override)
Test 102 SKIPPED (User request or override)
Test 103 SKIPPED (User request or override)
Test 200 SKIPPED (User request or override)
Test 201 SKIPPED (User request or override)
Test 202 SKIPPED (User request or override)
Test 203 SKIPPED (User request or override)
Test 300 SKIPPED (User request or override)
Test 301 SKIPPED (User request or override)
Test 302 SKIPPED (User request or override)
Test 303 SKIPPED (User request or override)
DEBUG:root:Test 400: return code: 0
DEBUG:root:plid_list: [1, 2, 3]
DEBUG:root:plid_index_set_in_scheme: {0, 1, 2}
DEBUG:root:plid_list_referenced_by_scheme: [1, 2, 3]
DEBUG:root:pid 0 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 8192 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 16384 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 24576 should not be touched by the scheme. ruamw of it 32 equals to 32
ERROR:root:pid 1 should not be touched by the scheme. ruamw of it(32) should equals to 32
DEBUG:root:pid 8193 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 16385 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 24577 should not be touched by the scheme. ruamw of it 32 equals to 32
ERROR:root:pid 2 should not be touched by the scheme. ruamw of it(32) should equals to 32
DEBUG:root:pid 8194 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 16386 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 24578 should not be touched by the scheme. ruamw of it 32 equals to 32
ERROR:root:pid 3 should not be touched by the scheme. ruamw of it(32) should equals to 32
DEBUG:root:pid 8195 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 16387 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 24579 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:Test 400: stderr:

DEBUG:root:Test 400: stdout:

Test 400 FAILED:  400
DEBUG:root:Test 401: return code: 0
DEBUG:root:plid_list: [1, 2, 3]
DEBUG:root:plid_index_set_in_scheme: {0, 1, 2}
DEBUG:root:plid_list_referenced_by_scheme: [1, 2, 3]
DEBUG:root:pid 0 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 8192 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 16384 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 24576 should not be touched by the scheme. ruamw of it 32 equals to 32
ERROR:root:pid 1 should not be touched by the scheme. ruamw of it(32) should equals to 32
DEBUG:root:pid 8193 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 16385 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 24577 should not be touched by the scheme. ruamw of it 32 equals to 32
ERROR:root:pid 2 should not be touched by the scheme. ruamw of it(32) should equals to 32
DEBUG:root:pid 8194 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 16386 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 24578 should not be touched by the scheme. ruamw of it 32 equals to 32
ERROR:root:pid 3 should not be touched by the scheme. ruamw of it(32) should equals to 32
DEBUG:root:pid 8195 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 16387 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:pid 24579 should not be touched by the scheme. ruamw of it 32 equals to 32
DEBUG:root:Test 401: stderr:

DEBUG:root:Test 401: stdout:

Test 401 FAILED:  401
0 test(s) passed, 2 failed, 26 skipped

This is the relevant part of my QEMU invocation:

        -device "nvme-subsys,id=nvme-subsys0,fdp=on,fdp.runs=128K,fdp.nrg=4,fdp.nruh=8" \
        -device "nvme,id=nvme0,serial=deadbeef,subsys=nvme-subsys0" \
        -drive "id=nvm-1,file=nvme.img,format=raw,if=none,discard=unmap,media=disk" \
        -device "nvme-ns,id=nvm-1,drive=nvm-1,bus=nvme0,nsid=1,logical_block_size=4096,physical_block_size=4096,fdp.ruhs=0;5;6;7" \
parkvibes commented 1 month ago

@vincentkfu, 400 & 401 tests are simplified including some bug-fixes.

vincentkfu commented 1 month ago

Please squash the final two patches.

The test still does not work. The code assumes that each of the listed PLIDs will be touched but that's not the case.

Fio will write 64K with holes of size 64K between each write. With these contents for lba.scheme,

0, 131072, 0
131072, 524288, 2
524288, 1048576, 1

plid 0 will receive one write but plid 2 will receive two writes.

# /root/fio-dev/1757/fio/fio  --name=nvmept-fdp  --ioengine=io_uring_cmd  --cmd_type=nvme  --randrepeat=0  --filename=/dev/ng0n1  --rw=write:65536  --bs=65536  --fdp=1  --fdp_pli=1,2,3  --fdp_pli_select=scheme  --dp_scheme=lba.scheme  --number_ios=3 --debug=io
fio: set debug option io
nvmept-fdp: (g=0): rw=write, bs=(R) 64.0KiB-64.0KiB, (W) 64.0KiB-64.0KiB, (T) 64.0KiB-64.0KiB, ioengine=io_uring_cmd, iodepth=1
fio-3.37
Starting 1 process
io       14824 invalidate not supported /dev/ng0n1
io       14824 dtype set to 0x2, dspec set to 0x2000
io       14824 fill: io_u 0x55e475352c40: off=0x0,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 prep: io_u 0x55e475352c40: off=0x0,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 prep: io_u 0x55e475352c40: ret=0
io       14824 queue: io_u 0x55e475352c40: off=0x0,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 calling ->commit(), depth 1
io       14824 io_u_queued_complete: min=1
io       14824 getevents: 1
io       14824 complete: io_u 0x55e475352c40: off=0x0,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 dtype set to 0x2, dspec set to 0x6000
io       14824 fill: io_u 0x55e475352c40: off=0x20000,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 prep: io_u 0x55e475352c40: off=0x20000,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 prep: io_u 0x55e475352c40: ret=0
io       14824 queue: io_u 0x55e475352c40: off=0x20000,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 calling ->commit(), depth 1
io       14824 io_u_queued_complete: min=1
io       14824 getevents: 1
io       14824 complete: io_u 0x55e475352c40: off=0x20000,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 dtype set to 0x2, dspec set to 0x6000
io       14824 fill: io_u 0x55e475352c40: off=0x40000,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 prep: io_u 0x55e475352c40: off=0x40000,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 prep: io_u 0x55e475352c40: ret=0
io       14824 queue: io_u 0x55e475352c40: off=0x40000,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 calling ->commit(), depth 1
io       14824 io_u_queued_complete: min=1
io       14824 getevents: 1
io       14824 complete: io_u 0x55e475352c40: off=0x40000,len=0x10000,ddir=1,file=/dev/ng0n1
io       14824 close ioengine io_uring_cmd
io       14824 free ioengine io_uring_cmd

nvmept-fdp: (groupid=0, jobs=1): err= 0: pid=14824: Fri May 17 17:00:30 2024
  write: IOPS=1500, BW=93.8MiB/s (98.3MB/s)(192KiB/2msec); 0 zone resets
    slat (usec): min=45, max=125, avg=77.05, stdev=42.69
    clat (usec): min=297, max=561, avg=409.81, stdev=136.51
     lat (usec): min=343, max=687, avg=486.86, stdev=179.04
    clat percentiles (usec):
     |  1.00th=[  297],  5.00th=[  297], 10.00th=[  297], 20.00th=[  297],
     | 30.00th=[  297], 40.00th=[  371], 50.00th=[  371], 60.00th=[  371],
     | 70.00th=[  562], 80.00th=[  562], 90.00th=[  562], 95.00th=[  562],
     | 99.00th=[  562], 99.50th=[  562], 99.90th=[  562], 99.95th=[  562],
     | 99.99th=[  562]
  lat (usec)   : 500=66.67%, 750=33.33%
  cpu          : usr=0.00%, sys=100.00%, ctx=3, majf=0, minf=15
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,3,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=93.8MiB/s (98.3MB/s), 93.8MiB/s-93.8MiB/s (98.3MB/s-98.3MB/s), io=192KiB (197kB), run=2-2msec
vincentkfu commented 1 month ago

Oh, I actually didn't appreciate this originally. Wouldn't it be easier for the scheme file to provide the placement indices directly? I now see that the scheme file provides indices to be used to select from the placement ids listed in the plids option. This seems overly complicated and the documentation is not clear that this is how the placement IDs are actually chosen.

Why not just use the indices in the scheme file directly as placement ID indices? Then when plid_select=scheme, fio can just ignore the plids option.

parkvibes commented 1 month ago

In fact, from the point of view of using the index, I was in the same position as you, but I chose this method because I copied the methods of random & RR. As you said, I agree that dealing with the index may cause confusion in the case of the scheme.

I'll update the code by directly assigning dspec from the value of scheme file. Also, plids/fdp_pli options will be ignored when plid_select/fdp_pli_select=scheme.

lba.scheme will be also refined for all plids to be touched.

vincentkfu commented 1 month ago

There actually is a problem with write:65536

root@localhost:~/fio-dev/1757/fio# python3 t/nvmept_fdp.py --dut /dev/ng0n1 --fio ./fio -o 400 401
Artifact directory is nvmept-fdp-test-20240520-172623
fio path is /root/fio-dev/1757/fio/fio
Test 1 SKIPPED (User request or override)
Test 2 SKIPPED (User request or override)
Test 3 SKIPPED (User request or override)
Test 4 SKIPPED (User request or override)
Test 5 SKIPPED (User request or override)
Test 6 SKIPPED (User request or override)
Test 7 SKIPPED (User request or override)
Test 8 SKIPPED (User request or override)
Test 9 SKIPPED (User request or override)
Test 10 SKIPPED (User request or override)
Test 11 SKIPPED (User request or override)
Test 12 SKIPPED (User request or override)
Test 13 SKIPPED (User request or override)
Test 14 SKIPPED (User request or override)
Test 100 SKIPPED (User request or override)
Test 101 SKIPPED (User request or override)
Test 102 SKIPPED (User request or override)
Test 103 SKIPPED (User request or override)
Test 200 SKIPPED (User request or override)
Test 201 SKIPPED (User request or override)
Test 202 SKIPPED (User request or override)
Test 203 SKIPPED (User request or override)
Test 300 SKIPPED (User request or override)
Test 301 SKIPPED (User request or override)
Test 302 SKIPPED (User request or override)
Test 303 SKIPPED (User request or override)
ERROR:root:Unhandled rw value write:65536
Test 400 FAILED:  400
ERROR:root:Unhandled rw value write:65536
Test 401 FAILED:  401
0 test(s) passed, 2 failed, 26 skipped

The problem is in _check_result():

        if self.fio_opts['rw'] in ['read', 'randread']:
            self.passed = self.check_all_ddirs(['read'], job)
        elif self.fio_opts['rw'] in ['write', 'randwrite']:
            if 'verify' not in self.fio_opts:
                self.passed = self.check_all_ddirs(['write'], job)
            else:
                self.passed = self.check_all_ddirs(['read', 'write'], job)
        elif self.fio_opts['rw'] in ['trim', 'randtrim']:
            self.passed = self.check_all_ddirs(['trim'], job)
        elif self.fio_opts['rw'] in ['readwrite', 'randrw']:
            self.passed = self.check_all_ddirs(['read', 'write'], job)
        elif self.fio_opts['rw'] in ['trimwrite', 'randtrimwrite']:
            self.passed = self.check_all_ddirs(['trim', 'write'], job)
        else:
            logging.error("Unhandled rw value %s", self.fio_opts['rw'])
            self.passed = False

The original code did not anticipate this usage of the rw option. Probably the easiest thing to do would be to strip the :65536 from fio_opts['rw'] and then carry out the comparison using the stripped value of the rw option.