DongyuanPan opened this issue 6 years ago
We are just starting to investigate performance.
One known issue is that for LIO and open-iscsi you need to have node.session.cmds_max match the LIO default_cmdsn_depth setting. If they are not the same, then there seems to be a bug on the initiator side where IOs are requeued and do not get retried quickly like normal.
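For reference, matching the two depths means setting the initiator value in `/etc/iscsi/iscsid.conf` and the TPG attribute on the target. A sketch, with the depth value and IQN as placeholders (distro defaults differ):

```
# Initiator side (/etc/iscsi/iscsid.conf), then log the session out and back in:
node.session.cmds_max = 128

# Target side: set the TPG's default_cmdsn_depth to the same value
targetcli /iscsi/iqn.2003-01.org.linux-iscsi.example:target/tpg1 set attribute default_cmdsn_depth=128
```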
There is another issue for latency/IOPS-type tests where one command slows others. The attached patch is a hack around it, but it needs work because it can cause extra switches.
For target_core_user there are other issues like its memory allocation in the main path, but you might not be hitting that with the fio arguments you are using.
Thank you~ @mikechristie
I retested performance with tcmu-runner-1.3.0 and set node.session.cmds_max to match the LIO default_cmdsn_depth. There is some improvement (18.8K IOPS), and the performance is now the same as with TGT. Is this the expected performance for tcmu-runner-1.3.0 without further optimization (i.e. the same as TGT)?
> There is another issue for latency/IOPS type of tests where one command slows others. The attached patch runner-dont-wait.txt is a hack around it but it needs work because it can cause extra switches.
If I test with the patch, the performance (32K IOPS) approaches that of RBD itself. But is the patch only for tests? The wakeup argument is determined by aio_track->tracked_aio_ops. Must AIO be tracked? What might occur if I do not track AIO? Can this parameter be specified by the user?
Thanks for testing.
> if I test with the patch, the performance (32K IOPS) approaches the RBD itself. But the patch is only for tests?
Yeah, the patch needs some cleanup, because of what you notice below.
> The argument wakeup is determined by aio_track->tracked_aio_ops. AIO must be tracked? What might occur if I do not track AIO? Can this parameter be specified by the user?
It is used during failover/failback and recovery to make sure IOs are not being executed in the handler modules (handler_rbd, handler_glfs, etc) when we execute a callout like lock() or (re)open().
So ideally, we would handle these issues with something like:

```
if (!wakeup && current_batch_wait > batch_timeout)
        tcmulib_processing_complete(dev);
```
@Github641234230 I noticed that your kernel version is 3.10.0-693.11.6.el7.x86_64. Did you add some patches to your kernel? Are you going to do HA? My kernel is 3.10.0-693.11.6.el7.x86_64 with tcmu-runner-1.3.0-rc4. I get IO errors when I modify the kernel parameter enable = 1.
@mikechristie If our product can only use CentOS 7.4 (3.10.0-693.11.6.el7.x86_64) and I want to do HA, what should I do? Which patch can I use?
I've tried using targetcli to export RBDs on all gateway nodes. On the iSCSI client side, I use dm-multipath to discover them, and it works well (both active/active and active/passive). Is there any problem using this method for HA? Also, issue https://github.com/open-iscsi/tcmu-runner/issues/356 says active/active is not supported. I am very confused.
@MIZZ122
For upstream tcmu-runner/ceph-iscsi-cli HA support, you have to use the RHEL 7.5 beta kernel or newer, or this kernel:
https://github.com/ceph/ceph-client.
HA is only supported with active/passive. You must use the settings here
http://docs.ceph.com/docs/master/rbd/iscsi-initiators/
Just because dm-multipath lets you set up active/active does not mean it is safe. You can end up with data corruption. Use the settings in the docs.
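The ceph docs linked above recommend failover-style multipath settings along these lines. This is a sketch based on those docs, not the authoritative stanza; check the linked page for the exact settings for your version:

```
# /etc/multipath.conf (sketch, per the ceph iscsi-initiators docs)
devices {
        device {
                vendor                 "LIO-ORG"
                hardware_handler       "1 alua"
                path_grouping_policy   "failover"
                path_selector          "queue-length 0"
                prio                   "alua"
                path_checker           "tur"
                failback               60
                fast_io_fail_tmo       25
                no_path_retry          "queue"
        }
}
```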
If you are doing single node (non HA) then you can do active/active across multiple portals on that one node.
@MIZZ122 If you have other questions about active/active, can you open a new issue or discuss it in the issue for active/active? This issue is for perf only.
@mikechristie Any update on this issue?
@lxbsz was testing it out for gluster with the perf team. lxbsz, did it help? Did you make the changes I requested, and were they needed, or was it ok to just always complete right away?
It looks like you probably got busy with resize, so I can do the changes. Are you still working with the perf team, so we can get them tested?
@mikechristie Yes, we and the perf team tested this together.
The environment is a PostgreSQL database running on a Gluster Block Volume in a CNS environment.
1. Changing node.session.cmds_max to match the LIO default_cmdsn_depth: the improvement was very small, about 5%.
2. Applying https://github.com/open-iscsi/tcmu-runner/files/1654757/runner-dont-wait.txt: the performance improved by about 10%.
3. Changing the default_cmdsn_depth to 64: the performance improved by about 27%.
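For anyone reproducing the depth change in item 3, it would be done per-TPG with targetcli and then renegotiated by re-logging the initiator in. A sketch (the IQN is a placeholder):

```
targetcli /iscsi/iqn.2018-01.com.example:gw/tpg1 set attribute default_cmdsn_depth=64
# log out and back in on the initiator so the new depth is negotiated
iscsiadm -m node -T iqn.2018-01.com.example:gw -u
iscsiadm -m node -T iqn.2018-01.com.example:gw -l
```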
So we are preparing to run more tests on this later. These days we are busy with the RHGS release.
Ok, assume this is back on me.
We will test this by mixing them up later once we have enough time.
Can I use this patch (https://github.com/open-iscsi/tcmu-runner/files/1654757/runner-dont-wait.txt) in a production ESXi environment? If it is not recommended, how can I help you investigate the performance issue so it can be fixed? I have all the needed hardware.
It is perfectly safe crash-wise but might cause other regressions. If you can test, I can give you a patch later this week that makes it configurable, so we can try to figure out if there is some balance between the two extreme settings (with and without the patch) or if it needs to be configurable per workload type.
Ok, I'm waiting for the patch and instructions on how to test it (ceph, tcmu-runner, fio).
In my test environment for ceph rbd, the TGT perf is better than LIO-TCMU. So I created an IBLOCK backstore from a /dev/sda block device with targetcli in order to test the LIO perf without tcmu/tcmu-runner.
| Test | LIO + SSD disk | TGT + SSD disk |
| --- | --- | --- |
| 4K rand_write | IOPS=48.9k, BW=191MiB/s | IOPS=49.2k, BW=192MiB/s |
| 4K rand_read | IOPS=44.9k, BW=175MiB/s | IOPS=46.5k, BW=182MiB/s |
| 64K write | IOPS=6221, BW=389MiB/s | IOPS=9100, BW=569MiB/s |
| 64K read | IOPS=8389, BW=524MiB/s | IOPS=19.3k, BW=1208MiB/s |
The perf of TGT is better than LIO. It's strange. Thanks for any help anyone can provide!
@mikechristie In my ceph cluster, the throughput of the scsi disks is much lower than RBD's. I run the LIO iscsi gateway in a vm with kernel version 4.16.0-0.rc6. In the vm, I compared the performance of tcmu-runner with KRBD using fio (sync=1 -ioengine=psync -bs=4M -numjobs=10).
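The fio invocation described above would look roughly like this (the device path and runtime are placeholders, not from the original report):

```
fio --name=seqwrite --filename=/dev/sdX --rw=write --bs=4M \
    --ioengine=psync --sync=1 --numjobs=10 --direct=1 \
    --runtime=60 --time_based --group_reporting
```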
| Scenario | KRBD | LIO + TCMU | TGT + rbd_bs |
| --- | --- | --- | --- |
| 4M seq write, one LIO gw for a RBD | BW=409MiB, avg lat=97ms | BW=131MiB, avg lat=305ms | BW=362MiB, avg lat=110ms |
| 4M seq read, one LIO gw for a RBD | BW=1571MiB, avg lat=25ms | BW=256MiB, avg lat=155ms | BW=1556MiB, avg lat=26ms |
| 4M seq write, one LIO gw for four RBDs | BW=205MiB, avg lat=190ms | BW=42MiB, avg lat=921ms | BW=193MiB, avg lat=206ms |
| 4M seq read, one LIO gw for four RBDs | BW=416MiB, avg lat=96ms | BW=148MiB, avg lat=270ms | BW=397MiB, avg lat=100ms |
I get poor throughput for the scsi disks when using TCMU. Does this have something to do with what you said earlier:

> For target_core_user there are other issues like its memory allocation in the main path
@mikechristie Has the runner-dont-wait.txt patch already been merged to 1.4RC1?
Yes.
@mikechristie I am having a performance issue with an EC RBD as the backend store. I am using 1.4RC1. KRBD seq write speed is about 600MB/s; TCMU+RBD seq write speed is around 30MB/s.
Hi @shadowlinyf, did you test it again afterwards? Is tcmu still very poor?
Now I am seeing the same performance problem. fio with rbd gives about 500MB/s; with tcmu user:rbd, the fio result is about 15MB/s. This performance is too poor. My env: kernel 5.0.4, tcmu latest release 1.4.1, ceph 12.2.11.
> now i meet the same performance like this, fio with rbd, the result was about 500MB/s, if with tcmu of user:rbd, the fio test result was about 15MB/s, this performance is too poor, my env is: kernel 5.0.4, tcmu latest release 1.4.1, ceph 12.2.11
I hit the same performance problem, and I seem to have solved it (although there are other performance issues). We can try using gwcli to set the following parameters for the disk:

```
/disks> reconfigure blockpool/image01 hw_max_sectors 8192
/disks> reconfigure blockpool/image01 max_data_area_mb 128
```

After setting these, the performance of tcmu can approximate the performance of librbd in HDD scenarios.
Hi~ I am a senior university student and I've been learning ceph and iscsi recently.
I'm using fio to test the performance of RBD, but performance degrades when using RBDs with LIO-TCMU.
My test compares the performance of the RBD as a target using LIO-TCMU, the performance of the RBD itself (no iSCSI or LIO-TCMU), and the performance of the RBD as a target using TGT.
Details about the test environment:
I use targetcli (or tgtadm) to create the target device and use the initiator to log in to it. Then I use fio to test the device.

1) the performance of the RBD itself (no iSCSI or LIO-TCMU)

```
rbd create image-10 --size 102400
```

(rbd default features = 3)

fio test config
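The fio config itself was not attached; a typical job for this kind of 4K random test directly against the RBD image might look like the following. This is illustrative only, not the poster's actual config (the clientname and pool are assumptions):

```
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=image-10
rw=randwrite
bs=4k
runtime=60
time_based=1

[rbd_iodepth32]
iodepth=32
```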
performance: 35-40 K IOPS
2) the performance of the RBD as a target using TGT

Create the LUN:

```
tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 --backing-store rbd/image-10 --bstype rbd
```

Log in from the initiator:

```
iscsiadm -m node --targetname iqn.2018-01.com.example02:iscsi -p 192.168.x.x:3260 -l
```

The LUN shows up as /dev/sdw.
fio test
performance: 18-20K IOPS
3) the performance of the RBD as a target using LIO-TCMU

I use targetcli to create the LUN and TPG, with default_cmdsn_depth=512. On the initiator side: node.session.cmds_max = 2048, node.session.queue_depth = 1024.

/dev/sdv is backed by image-10.
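The initiator-side values above live in /etc/iscsi/iscsid.conf, or can be applied to an existing node record with iscsiadm. A sketch (the IQN is a placeholder):

```
# /etc/iscsi/iscsid.conf
node.session.cmds_max = 2048
node.session.queue_depth = 1024

# or update an existing node record, then re-login:
iscsiadm -m node -T iqn.2018-01.com.example:iscsi -o update \
    -n node.session.cmds_max -v 2048
```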
performance: 7K IOPS
I found some issues similar to mine, but I still haven't found the problem: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-October/044021.html http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-December/045347.html
Thanks for any help anyone can provide!