ceph / ceph-iscsi

Ceph iSCSI tools
GNU General Public License v3.0

ceph-iscsi / tcmu-runner bad performance with VMware ESXi #246

Open lightmans2 opened 2 years ago

lightmans2 commented 2 years ago

Hello everyone,

I need some help with our Ceph 16.2.5 cluster, which serves as an iSCSI target for ESXi nodes.

Background info:

ESXi iSCSI config:
esxcli system settings advanced set -o /ISCSI/MaxIoSizeKB -i 512
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_LunQDepth=64
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_HostQDepth=64
esxcli system settings advanced set --int-value 1 --option /DataMover/HardwareAcceleratedMove

The OSD nodes, MONs, RGW/iSCSI gateways, and ESXi nodes are all connected to the 10 Gbit network with bond-rr.

RADOS benchmark on the rbd pool:

root@cd133-ceph-osdh-01:~# rados bench -p rbd 10 write
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_cd133-ceph-osdh-01_87894
  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
    0       0         0         0         0         0           -           0
    1      16        69        53   211.987       212    0.250578    0.249261
    2      16       129       113   225.976       240    0.296519    0.266439
    3      16       183       167   222.641       216    0.219422    0.273838
    4      16       237       221   220.974       216    0.469045     0.28091
    5      16       292       276   220.773       220    0.249321     0.27565
    6      16       339       323   215.307       188    0.205553     0.28624
    7      16       390       374   213.688       204    0.188404    0.290426
    8      16       457       441   220.472       268    0.181254    0.286525
    9      16       509       493   219.083       208    0.250538    0.286832
   10      16       568       552   220.772       236    0.307829    0.286076
Total time run:         10.2833
Total writes made:      568
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     220.941
Stddev Bandwidth:       22.295
Max bandwidth (MB/sec): 268
Min bandwidth (MB/sec): 188
Average IOPS:           55
Stddev IOPS:            5.57375
Max IOPS:               67
Min IOPS:               47
Average Latency(s):     0.285903
Stddev Latency(s):      0.115162
Max latency(s):         0.88187
Min latency(s):         0.119276
Cleaning up (deleting benchmark objects)
Removed 568 objects
Clean up completed and total clean up time :3.18627

The benchmark shows that roughly 220 MB/s is possible with this short test (268 MB/s peak), but I have seen much more in practice, up to 550 MB/s.
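As a cross-check on the benchmark summary, the reported bandwidth and IOPS can be recomputed from the totals. The sketch below only uses figures copied from the `rados bench` output above. The function names are made up for illustration. It also applies Little's law (operations in flight divided by average latency) as a second estimate of IOPS.

```python
# Figures copied from the `rados bench -p rbd 10 write` summary above.
TOTAL_WRITES = 568
WRITE_SIZE_BYTES = 4194304   # 4 MiB per object
TOTAL_TIME_S = 10.2833
CONCURRENCY = 16             # "Maintaining 16 concurrent writes"
AVG_LATENCY_S = 0.285903

MIB = 1024 * 1024


def bandwidth_mib_s(writes: int, size_bytes: int, seconds: float) -> float:
    """Average bandwidth in MiB/s over the whole run."""
    return writes * size_bytes / MIB / seconds


def average_iops(writes: int, seconds: float) -> float:
    """Average completed operations per second."""
    return writes / seconds


def littles_law_iops(in_flight: int, latency_s: float) -> float:
    """Little's law: throughput = operations in flight / average latency."""
    return in_flight / latency_s


bw = bandwidth_mib_s(TOTAL_WRITES, WRITE_SIZE_BYTES, TOTAL_TIME_S)
ops = average_iops(TOTAL_WRITES, TOTAL_TIME_S)
ops_from_latency = littles_law_iops(CONCURRENCY, AVG_LATENCY_S)

print(f"bandwidth: {bw:.3f} MiB/s")
print(f"IOPS from totals: {ops:.1f}")
print(f"IOPS from Little's law: {ops_from_latency:.1f}")
```

The recomputed bandwidth only matches the reported 220.941 when "MB" is read as 2^20 bytes. Both IOPS estimates land near the reported average of 55, so with 16 writes in flight and about 0.29 s per 4 MiB write, this run is bounded by per-operation latency rather than by the 10 Gbit links.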

If I run iftop on one OSD node, I see the Ceph iSCSI gateways (they show up under their rgw names) and the traffic is nearly 80 mb/s. (image)

The Ceph dashboard shows an iSCSI write performance of only 40 MB/s. The maximum I have seen was between 40 and 60 MB/s, which is very poor. (image)

If I look at the vCenter and ESXi datastore performance, I see very high storage device latencies, between 50 and 100 ms, which is very bad. (image)

root@cd133-ceph-mon-01:/home/cephadm# ceph config dump
WHO                                               MASK       LEVEL     OPTION                                       VALUE                                                                                        RO
global                                                       basic     container_image                              docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  *
global                                                       advanced  journal_max_write_bytes                      1073714824
global                                                       advanced  journal_max_write_entries                    10000
global                                                       advanced  mon_osd_cache_size                           1024
global                                                       dev       osd_client_watch_timeout                     15
global                                                       dev       osd_heartbeat_interval                       5
global                                                       advanced  osd_map_cache_size                           128
global                                                       advanced  osd_max_write_size                           512
global                                                       advanced  rados_osd_op_timeout                         5
global                                                       advanced  rbd_cache_max_dirty                          134217728
global                                                       advanced  rbd_cache_max_dirty_age                      5.000000
global                                                       advanced  rbd_cache_size                               268435456
global                                                       advanced  rbd_op_threads                               2
  mon                                                        advanced  auth_allow_insecure_global_id_reclaim        false
  mon                                                        advanced  cluster_network                              10.50.50.0/24                                                                                *
  mon                                                        advanced  public_network                               10.50.50.0/24                                                                                *
  mgr                                                        advanced  mgr/cephadm/container_init                   True                                                                                         *
  mgr                                                        advanced  mgr/cephadm/device_enhanced_scan             true                                                                                         *
  mgr                                                        advanced  mgr/cephadm/migration_current                2                                                                                            *
  mgr                                                        advanced  mgr/cephadm/warn_on_stray_daemons            false                                                                                        *
  mgr                                                        advanced  mgr/cephadm/warn_on_stray_hosts              false                                                                                        *
  mgr                                                        advanced  mgr/dashboard/10.50.50.21/server_addr                                                                                                     *
  mgr                                                        advanced  mgr/dashboard/ALERTMANAGER_API_HOST          http://10.221.133.161:9093                                                                   *
  mgr                                                        advanced  mgr/dashboard/GRAFANA_API_SSL_VERIFY         false                                                                                        *
  mgr                                                        advanced  mgr/dashboard/GRAFANA_API_URL                https://10.221.133.161:3000                                                                  *
  mgr                                                        advanced  mgr/dashboard/ISCSI_API_SSL_VERIFICATION     true                                                                                         *
  mgr                                                        advanced  mgr/dashboard/NAME/server_port               80                                                                                           *
  mgr                                                        advanced  mgr/dashboard/PROMETHEUS_API_HOST            http://10.221.133.161:9095                                                                   *
  mgr                                                        advanced  mgr/dashboard/PROMETHEUS_API_SSL_VERIFY      false                                                                                        *
  mgr                                                        advanced  mgr/dashboard/RGW_API_ACCESS_KEY             W8VEKVFDK1RH5IH2Q3GN                                                                         *
  mgr                                                        advanced  mgr/dashboard/RGW_API_SECRET_KEY             IkIjmjfh3bMLrPOlAFbMfpigSIALAQoKGEHzZgxv                                                     *
  mgr                                                        advanced  mgr/dashboard/camdatadash/server_addr        10.251.133.161                                                                               *
  mgr                                                        advanced  mgr/dashboard/camdatadash/ssl_server_port    8443                                                                                         *
  mgr                                                        advanced  mgr/dashboard/cd133-ceph-mon-01/server_addr                                                                                               *
  mgr                                                        advanced  mgr/dashboard/dasboard/server_port           80                                                                                           *
  mgr                                                        advanced  mgr/dashboard/dashboard/server_addr          10.251.133.161                                                                               *
  mgr                                                        advanced  mgr/dashboard/dashboard/ssl_server_port      8443                                                                                         *
  mgr                                                        advanced  mgr/dashboard/server_addr                    0.0.0.0                                                                                      *
  mgr                                                        advanced  mgr/dashboard/server_port                    8080                                                                                         *
  mgr                                                        advanced  mgr/dashboard/ssl                            false                                                                                        *
  mgr                                                        advanced  mgr/dashboard/ssl_server_port                8443                                                                                         *
  mgr                                                        advanced  mgr/orchestrator/orchestrator                cephadm
  mgr                                                        advanced  mgr/prometheus/server_addr                   0.0.0.0                                                                                      *
  mgr                                                        advanced  mgr/telemetry/channel_ident                  true                                                                                         *
  mgr                                                        advanced  mgr/telemetry/contact                        hf@yy.de                                                                                *
  mgr                                                        advanced  mgr/telemetry/description                    ceph cluster                                                                         *
  mgr                                                        advanced  mgr/telemetry/enabled                        true                                                                                         *
  mgr                                                        advanced  mgr/telemetry/last_opt_revision              3                                                                                            *
  osd                                                        dev       bluestore_cache_autotune                     false
  osd                                             class:ssd  dev       bluestore_cache_autotune                     false
  osd                                                        dev       bluestore_cache_size                         4000000000
  osd                                             class:ssd  dev       bluestore_cache_size                         4000000000
  osd                                                        dev       bluestore_cache_size_hdd                     4000000000
  osd                                                        dev       bluestore_cache_size_ssd                     4000000000
  osd                                             class:ssd  dev       bluestore_cache_size_ssd                     4000000000
  osd                                                        advanced  bluestore_default_buffered_write             true
  osd                                             class:ssd  advanced  bluestore_default_buffered_write             true
  osd                                                        advanced  osd_max_backfills                            1
  osd                                             class:ssd  dev       osd_memory_cache_min                         4000000000
  osd                                             class:hdd  basic     osd_memory_target                            6000000000
  osd                                             class:ssd  basic     osd_memory_target                            6000000000
  osd                                                        advanced  osd_recovery_max_active                      3
  osd                                                        advanced  osd_recovery_max_single_start                1
  osd                                                        advanced  osd_recovery_sleep                           0.000000
    client.rgw.ceph-rgw.cd133-ceph-rgw-01.klvrwk             basic     rgw_frontends                                beast port=8000                                                                              *
    client.rgw.ceph-rgw.cd133-ceph-rgw-01.ptmqcm             basic     rgw_frontends                                beast port=8001                                                                              *
    client.rgw.ceph-rgw.cd88-ceph-rgw-01.czajah              basic     rgw_frontends                                beast port=8000                                                                              *
    client.rgw.ceph-rgw.cd88-ceph-rgw-01.pdknfg              basic     rgw_frontends                                beast port=8000                                                                              *
    client.rgw.ceph-rgw.cd88-ceph-rgw-01.qkdlfl              basic     rgw_frontends                                beast port=8001                                                                              *
    client.rgw.ceph-rgw.cd88-ceph-rgw-01.tdsxpb              basic     rgw_frontends                                beast port=8001                                                                              *
    client.rgw.ceph-rgw.cd88-ceph-rgw-01.xnadfr              basic     rgw_frontends                                beast port=8001                                                                              *
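The size-related values in the dump are raw byte counts, which are hard to compare by eye. A minimal sketch that converts a few of them (copied from the dump above) into binary units follows. The helper name `human_size` is hypothetical and not part of any Ceph tooling.

```python
# Byte-valued options copied from the `ceph config dump` output above.
SIZES = {
    "journal_max_write_bytes": 1073714824,
    "rbd_cache_max_dirty": 134217728,
    "rbd_cache_size": 268435456,
    "bluestore_cache_size": 4000000000,
    "osd_memory_cache_min": 4000000000,
    "osd_memory_target": 6000000000,
}


def human_size(n: int) -> str:
    """Format a byte count using binary units (KiB, MiB, GiB)."""
    for unit, factor in (("GiB", 1024**3), ("MiB", 1024**2), ("KiB", 1024)):
        if n >= factor:
            return f"{n / factor:.2f} {unit}"
    return f"{n} B"


for option, value in SIZES.items():
    print(f"{option:28s} {value:>12d}  {human_size(value)}")
```

One thing this makes visible: `journal_max_write_bytes` is 27000 bytes short of 1 GiB (1073741824), so the configured value may be a digit transposition. Whether that matters for this cluster is not something the dump alone can tell.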

Can somebody explain what I am doing wrong, or what I can do to get better performance with ceph-iscsi? No matter what I do or tweak, the write performance does not improve.

I have already experimented with gwcli, the iSCSI queue, and other settings. Currently I have set hw_max_sectors 8192, max_data_area_mb 32, and cmdsn_depth 64. The ESXi nodes are already fixed at a maximum of 64 iSCSI commands.
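To relate the queue depth of 64 to the 50 to 100 ms device latencies reported earlier, Little's law gives a rough ceiling: a queue of N outstanding commands that each take L seconds completes at most N / L commands per second. The sketch below is only an illustration under stated assumptions. It assumes the queue is kept full, and the 64 KiB I/O size is a hypothetical value chosen for comparison. Only the 512 KiB size (MaxIoSizeKB), the depth of 64, and the latency range come from the post.

```python
def iops_ceiling(outstanding_ios: int, latency_s: float) -> float:
    """Little's law upper bound on completed I/Os per second."""
    return outstanding_ios / latency_s


def throughput_mib_s(outstanding_ios: int, latency_s: float, io_size_kib: int) -> float:
    """Throughput ceiling in MiB/s for a given I/O size."""
    return iops_ceiling(outstanding_ios, latency_s) * io_size_kib / 1024


QUEUE_DEPTH = 64              # iscsivmk_LunQDepth / cmdsn_depth from the post
LATENCIES_S = (0.050, 0.100)  # observed datastore latency range
IO_SIZES_KIB = (64, 512)      # 64 KiB is hypothetical; 512 KiB is MaxIoSizeKB

for latency in LATENCIES_S:
    for io_size in IO_SIZES_KIB:
        ceiling = throughput_mib_s(QUEUE_DEPTH, latency, io_size)
        print(f"latency {latency * 1000:.0f} ms, {io_size} KiB I/Os: "
              f"at most {ceiling:.0f} MiB/s")
```

At these latencies the ceiling depends heavily on the I/O size the initiator actually issues and on how many commands are really in flight. Measuring both on the ESXi side would show whether the queue settings or the per-command latency is the limiting factor.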

Everything else is fine: multipathing is working and recovery is fast. But iSCSI is very slow and I don't know why. Can somebody help me?

breeze-cool commented 2 years ago

Try turning off multipathing, or turn off the exclusive-lock feature.