open-iscsi / tcmu-runner

A daemon that handles the userspace side of the LIO TCM-User backstore.
Apache License 2.0

IO return to zero #553

Open · AresChen1 opened this issue 5 years ago

AresChen1 commented 5 years ago

Env: tcmu-runner 1.4.1, CentOS 7.5, kernel 3.10, Ceph 13.2.5

Reproduce:

1. Create a Ceph pool named "pool1" and set its quota to 10GB.
2. Create three images named "pool1_image0", "pool1_image1", and "pool1_image2" on "pool1", each sized 20GB.
3. Create another Ceph pool named "pool2" and set its quota to 10GB.
4. Create three images named "pool2_image0", "pool2_image1", and "pool2_image2" on "pool2", each sized 20GB (a CLI sketch of steps 1-4 follows this list).
5. Create a target and add three gateways to it, then add an initiator to the target. Set "pool1_image0" and "pool2_image0" to "ALUA state: Active/optimized" on gateway1, "pool1_image1" and "pool2_image1" to "ALUA state: Active/optimized" on gateway2, and "pool1_image2" and "pool2_image2" to "ALUA state: Active/optimized" on gateway3. So image0's lock is on gateway1, image1's lock is on gateway2, and image2's lock is on gateway3.
6. Map these six LUNs to the initiator.
7. Use "device-mapper-multipath" and "iscsi-initiator-utils" to log in to the target.
8. Use "fio" to write data to the multipath devices.
9. When "pool1" is full, every image's IO drops to zero (including the images on "pool2").
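For reference, a minimal CLI sketch of steps 1-4 using the stock ceph/rbd tools; the PG counts and the byte value used for the 10GB quota are only illustrative, and exact flags can differ across Ceph releases:

# Pool with a 10GB quota, then three 20GB images on it -- each image alone
# is larger than the pool quota, so writes will eventually stall the pool.
ceph osd pool create pool1 64 64
rbd pool init pool1
ceph osd pool set-quota pool1 max_bytes 10737418240   # 10GB

for i in 0 1 2; do
    rbd create pool1/pool1_image$i --size 20G
done

# Same layout for the second pool.
ceph osd pool create pool2 64 64
rbd pool init pool2
ceph osd pool set-quota pool2 max_bytes 10737418240   # 10GB

for i in 0 1 2; do
    rbd create pool2/pool2_image$i --size 20G
done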

Root cause: When tcmu-runner gets a timeout error code from librbd, it disables the TPG, so the TCP connection from the initiator is closed. If all three gateways get the timeout error code, all three TPGs are disabled, and every image's IO drops to zero.
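(For context, "disable the tpg" corresponds to writing 0 to that portal group's enable attribute in the LIO configfs tree, which tears down the initiator's connections to it. A rough sketch; the IQN and tpgt number below are placeholders:)

# Disabling a TPG drops the initiator's connections to that portal group;
# writing 1 re-enables it once the backend has recovered.
echo 0 > /sys/kernel/config/target/iscsi/<target_iqn>/tpgt_1/enable
echo 1 > /sys/kernel/config/target/iscsi/<target_iqn>/tpgt_1/enable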

How can I fix this? Is there a way to avoid disabling the TPG?

mikechristie commented 5 years ago

When you wrote "all image's IO is return to zero" do you mean all IOs will be failed by the multipath layer when the no_path_retry timer in /etc/multipath.conf expires?

Or, do you mean the IO returns successfully, but for READ commands the data in the buffer is zeros when it should be some non-zero data?
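(For reference, no_path_retry is the multipath setting that controls how long IO is queued after all paths to a LUN have failed before it is errored back to the caller. An illustrative /etc/multipath.conf fragment; the values are examples only:)

defaults {
    polling_interval 5
    # Queue IO for 12 polling intervals after all paths fail, then fail
    # outstanding and new IO back to the caller. "queue" retries forever,
    # "fail" errors out immediately.
    no_path_retry 12
}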

mikechristie commented 5 years ago

Also, can you tell me if you are using the ceph-iscsi tools like gwcli to setup or using targetcli directly?

If you are using ceph-iscsi could you tell me the version?

AresChen1 commented 5 years ago

> When you wrote "all image's IO is return to zero" do you mean all IOs will be failed by the multipath layer when the no_path_retry timer in /etc/multipath.conf expires?
>
> Or, do you mean the IO returns successfully, but for READ commands the data in the buffer is zeros when it should be some non-zero data?

It can't read or write data on the multipath device. Like this:

[root@ldap ~]# fio -filename=/dev/mapper/mpathg -direct=1 -iodepth 64 -thread -rw=randwrite -ioengine=libaio -bs=64K -size=1G -numjobs=1 -runtime=900 -group_reporting -name=mytest
mytest: (g=0): rw=randwrite, bs=(R) 64.0KiB-64.0KiB, (W) 64.0KiB-64.0KiB, (T) 64.0KiB-64.0KiB, ioengine=libaio, iodepth=64
fio-3.1
Starting 1 thread
Jobs: 1 (f=1): [w(1)][60.0%][r=0KiB/s,w=0KiB/s][r=0,w=0][eta 12m:00s]

It also can't read or write data on the multipath devices exported from "pool2", even though "pool2" is not full. When "tcmu-runner" gets a "timeout" error code from librbd for the full "pool1", it disables the TPG in tcmu_notify_conn_lost in the tcmu-runner project. Since there are three multipath devices from "pool1", each with its Active/optimized ALUA group on a different one of the three gateways, all three TPGs get disabled.

AresChen1 commented 5 years ago

> Also, can you tell me if you are using the ceph-iscsi tools like gwcli to setup or using targetcli directly?
>
> If you are using ceph-iscsi could you tell me the version?

I am using the ceph-iscsi tools, version 3.0. The targetcli listing looks like this:

o- / ......................................................................... [...]
  o- backstores .............................................................. [...]
  | o- block .................................................. [Storage Objects: 0]
  | o- fileio ................................................. [Storage Objects: 0]
  | o- pscsi .................................................. [Storage Objects: 0]
  | o- ramdisk ................................................ [Storage Objects: 0]
  | o- user:glfs .............................................. [Storage Objects: 0]
  | o- user:qcow .............................................. [Storage Objects: 0]
  | o- user:rbd ............................................... [Storage Objects: 1]
  | | o- testpool.testimage  [testpool/testimage;osd_op_timeout=30 (5.0GiB) activated]
  | |   o- alua ................................................... [ALUA Groups: 4]
  | |     o- ano2 ............................... [ALUA state: Active/non-optimized]
  | |     o- ano3 ............................... [ALUA state: Active/non-optimized]
  | |     o- ao ..................................... [ALUA state: Active/optimized]
  | |     o- default_tg_pt_gp ....................... [ALUA state: Active/optimized]
  | o- user:zbc ............................................... [Storage Objects: 0]
  o- iscsi ............................................................ [Targets: 1]
  | o- iqn.2019-06.com.xitcorp.iscsi-gw:7669ac11e98c61a1 ................. [TPGs: 3]
  |   o- tpg1 ........................................................... [disabled]
  |   | o- acls .......................................................... [ACLs: 0]
  |   | o- luns .......................................................... [LUNs: 1]
  |   | | o- lun0 ................................... [user/testpool.testimage (ao)]
  |   | o- portals .................................................... [Portals: 1]
  |   |   o- 10.0.100.12:3260 ................................................. [OK]
  |   o- tpg2 ........................................................... [disabled]
  |   | o- acls .......................................................... [ACLs: 0]
  |   | o- luns .......................................................... [LUNs: 1]
  |   | | o- lun0 ................................. [user/testpool.testimage (ano2)]
  |   | o- portals .................................................... [Portals: 1]
  |   |   o- 10.0.100.11:3260 ................................................. [OK]
  |   o- tpg3 .......................................... [no-gen-acls, auth per-acl]
  |     o- acls .......................................................... [ACLs: 1]
  |     | o- iqn.1994-05.com.redhat:a92e9f6e6e80 ...... [1-way auth, Mapped LUNs: 1]
  |     |   o- mapped_lun0 ..................... [lun0 user/testpool.testimage (rw)]
  |     o- luns .......................................................... [LUNs: 1]
  |     | o- lun0 ................................. [user/testpool.testimage (ano3)]
  |     o- portals .................................................... [Portals: 1]
  |       o- 10.0.100.10:3260 ................................................. [OK]
  o- loopback ......................................................... [Targets: 0]
  o- vhost ............................................................ [Targets: 0]
  o- xen-pvscsi ....................................................... [Targets: 0]

mikechristie commented 5 years ago

Ok, here are the options.

1. You can configure a target per pool. That way each LUN's iscsi sessions will be separate. One pool hanging and timing out due to a pool full issue will not affect the other target and its pool.

2. You can run the gwcli disk reconfigure command to set the osd_op_timeout to a high value.

3. If you want to blanket turn it off then I will need to send a patch to modify the code, because if you try osd_op_timeout=0 then some other code is going to kick in and try to adjust it.
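(A rough sketch of option 2 from gwcli on one of the gateways; the exact reconfigure syntax varies between ceph-iscsi releases, so check the built-in help, and the disk name and timeout value here are only examples based on the listing above:)

# Raise the librbd op timeout so a slow or full pool does not trip
# tcmu-runner's connection-lost handling as quickly.
/> cd /disks
/disks> reconfigure testpool/testimage osd_op_timeout 300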
AresChen1 commented 5 years ago

> Ok, here are the options.
>
> 1. You can configure a target per pool. That way each LUN's iscsi sessions will be separate. One pool hanging and timing out due to a pool full issue will not affect the other target and its pool.
>
> 2. You can run the gwcli disk reconfigure command to set the osd_op_timeout to a high value.
>
> 3. If you want to blanket turn it off then I will need to send a patch to modify the code, because if you try osd_op_timeout=0 then some other code is going to kick in and try to adjust it.

Can I comment out the code that sets the tpg's enable to 0 in tgt_port_grp_recovery_thread_fn?

mikechristie commented 5 years ago

Yeah, as a temp hack that will be ok.