ceph / ceph-iscsi-cli

NOTICE: moved to https://github.com/ceph/ceph-iscsi

Disk defined to the config, but image can not be found in pool #108

Open. longlnk opened this issue 6 years ago.

longlnk commented 6 years ago

Hi,

I got the following error after rebooting one of the two gateways, and now I can't start rbd-target-api.service:

Jul 15 12:02:58 cephgw journal: Processing LUN configuration
Jul 15 12:02:58 cephgw journal: (LUN.add_dev_to_lio) Adding image 'rbd.disk_1' to LIO
Jul 15 12:02:58 cephgw journal: (LUN.add_dev_to_lio) Successfully added rbd.disk_1 to LIO
Jul 15 12:02:58 cephgw journal: (LUN.add_dev_to_lio) Adding image 'rbd.disk_2' to LIO
Jul 15 12:02:58 cephgw journal: (LUN.add_dev_to_lio) Successfully added rbd.disk_2 to LIO
Jul 15 12:02:59 cephgw journal: (LUN.add_dev_to_lio) Adding image 'rbd.disk_3' to LIO
Jul 15 12:02:59 cephgw journal: (LUN.add_dev_to_lio) Successfully added rbd.disk_3 to LIO
Jul 15 12:02:59 cephgw journal: (LUN.add_dev_to_lio) Adding image 'rbd1.disk_1' to LIO
Jul 15 12:02:59 cephgw journal: (LUN.add_dev_to_lio) Successfully added rbd1.disk_1 to LIO
Jul 15 12:02:59 cephgw journal: (LUN.add_dev_to_lio) Adding image 'rbd2.disk_1' to LIO
Jul 15 12:03:00 cephgw journal: (LUN.add_dev_to_lio) Successfully added rbd2.disk_1 to LIO
Jul 15 12:03:00 cephgw journal: (LUN.add_dev_to_lio) Adding image 'rbd2.disk_2' to LIO
Jul 15 12:03:00 cephgw journal: (LUN.add_dev_to_lio) Successfully added rbd2.disk_2 to LIO
Jul 15 12:03:00 cephgw journal: (LUN.add_dev_to_lio) Adding image 'rbd2.disk_3' to LIO
Jul 15 12:03:00 cephgw journal: (LUN.add_dev_to_lio) Successfully added rbd2.disk_3 to LIO
Jul 15 12:03:00 cephgw journal: Disk 'rbd2.disk_4' defined to the config, but image 'disk_4' can not be found in 'rbd2' pool
Jul 15 12:03:00 cephgw journal: Removing Ceph iSCSI configuration from LIO
Jul 15 12:03:00 cephgw journal: Removing iSCSI target from LIO
Jul 15 12:03:00 cephgw journal: Removing LUNs from LIO
Jul 15 12:03:00 cephgw journal: Active Ceph iSCSI gateway configuration removed
Jul 15 12:03:00 cephgw systemd: rbd-target-gw.service: main process exited, code=exited, status=16/n/a
Jul 15 12:03:00 cephgw systemd: Unit rbd-target-gw.service entered failed state.
Jul 15 12:03:00 cephgw journal: Shutdown received
Jul 15 12:03:00 cephgw systemd: rbd-target-gw.service failed.
Jul 15 12:03:00 cephgw systemd: Stopping Ceph iscsi target configuration API...
Jul 15 12:03:00 cephgw systemd: Stopped Ceph iscsi target configuration API.
Jul 15 12:03:00 cephgw systemd: rbd-target-gw.service holdoff time over, scheduling restart.
Jul 15 12:03:00 cephgw systemd: start request repeated too quickly for rbd-target-gw.service
Jul 15 12:03:00 cephgw systemd: Failed to start Setup system to export rbd images through LIO.
Jul 15 12:03:00 cephgw systemd: Dependency failed for Ceph iscsi target configuration API.
Jul 15 12:03:00 cephgw systemd: Job rbd-target-api.service/start failed with result 'dependency'.

The information from gwcli on the other gateway:

1 gateway is inaccessible - updates will be disabled
/disks> ls
o- disks ........................................................................................................... [40G, Disks: 8]
  o- rbd.disk_1 ...................................................................................................... [disk_1 (5G)]
  o- rbd.disk_2 ...................................................................................................... [disk_2 (5G)]
  o- rbd.disk_3 ...................................................................................................... [disk_3 (5G)]
  o- rbd1.disk_1 ..................................................................................................... [disk_1 (5G)]
  o- rbd2.disk_1 ..................................................................................................... [disk_1 (5G)]
  o- rbd2.disk_2 ..................................................................................................... [disk_2 (5G)]
  o- rbd2.disk_3 ..................................................................................................... [disk_3 (5G)]
  o- rbd2.disk_4 ..................................................................................................... [disk_4 (5G)]
/disks> cd

The output from the rbd CLI:

[root@cephgw2 ~]# rbd ls -l rbd2
NAME   SIZE  PARENT FMT PROT LOCK
disk_1 5120M         2
disk_2 5120M         2
disk_3 5120M         2
disk_4 5120M         2

I installed these packages (OS: CentOS 7.5, kernel 4.16):

ceph-iscsi-cli-2.6-1.el7.centos.noarch.rpm
ceph-iscsi-config-2.5-1.el7.centos.noarch.rpm
python-rtslib-2.1.67-1.noarch.rpm
targetcli-fb-2.1.fb48-1.noarch.rpm
tcmu-runner-1.3.0-rc4.el7.centos.x86_64.rpm

I can't use ceph-iscsi 2.7 because it fails with the error "ValueError: No JSON object could be decoded".

Please help me :(

dillaman commented 6 years ago

Sounds like you deleted the disk using the rbd CLI or perhaps you encountered an error when trying to delete it via gwcli. Either way, the easiest solution is to just run rbd create --pool rbd2 --size 5G disk_4
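
A minimal sketch of checking before recreating, assuming the default /etc/ceph/ceph.conf and the client.admin keyring are available on the node where you run it (note that "rbd create" refuses to overwrite an image that already exists, so it cannot destroy existing data):

# Only create the image if it really is missing from the pool
rbd info rbd2/disk_4 \
    || rbd create --pool rbd2 --size 5G disk_4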

longlnk commented 6 years ago

I didn't delete the disk with the rbd CLI, and I didn't delete it via gwcli.

I have two gateways. These are the steps I did (a sketch of the gwcli commands is shown after the list):

  1. Create disk_1, disk_2, disk_3, disk_4 in the rbd pool with gwcli
  2. Create disk_1, disk_2 in the rbd1 pool with gwcli
  3. Create disk_1, disk_2, disk_3, disk_4 in the rbd2 pool with gwcli
  4. I restarted cephgw1
  5. The error message appeared (as mentioned in the first post) and I could not start rbd-target-api.service on cephgw1
  6. Using the rbd CLI, I saw disk_4 in the rbd2 pool (as mentioned in the first post)
  7. Using gwcli, I saw disk_4 in the rbd2 pool (as mentioned in the first post)
  8. I restarted cephgw2 -> same error as in step 5
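
A minimal sketch of the disk-creation commands in steps 1-3, assuming the "create pool=... image=... size=..." syntax of gwcli 2.x; only the first and last disks are shown, the others follow the same pattern:

/disks> create pool=rbd image=disk_1 size=5G
/disks> create pool=rbd2 image=disk_4 size=5G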

I did this 3 times and the error occurred 1 time (a probability of about 30 percent).

I'm not sure about running "rbd create --pool rbd2 --size 5G disk_4". Will there be data loss? :(

dillaman commented 6 years ago

The error message you pasted above is that it cannot find disk_4 in the rbd2 pool. Are you saying that the image does exist in the pool and that gwcli is incorrectly saying it doesn’t exist?

longlnk commented 6 years ago

Yes, I used the rbd CLI and I saw disk_4 in rbd2:

[root@cephosd ~]# rbd ls -l rbd2
NAME   SIZE  PARENT FMT PROT LOCK
disk_1 5120M         2
disk_2 5120M         2
disk_3 5120M         2
disk_4 5120M         2

After that, I ran sudo rbd-target-gw and sudo rbd-target-api on the console, used gwcli, and saw disk_4 in the rbd2 pool:

2 gateway is inaccessible - updates will be disabled
/disks> ls
o- disks ........................................................................................................... [40G, Disks: 8]
  o- rbd.disk_1 ...................................................................................................... [disk_1 (5G)]
  o- rbd.disk_2 ...................................................................................................... [disk_2 (5G)]
  o- rbd.disk_3 ...................................................................................................... [disk_3 (5G)]
  o- rbd1.disk_1 ..................................................................................................... [disk_1 (5G)]
  o- rbd2.disk_1 ..................................................................................................... [disk_1 (5G)]
  o- rbd2.disk_2 ..................................................................................................... [disk_2 (5G)]
  o- rbd2.disk_3 ..................................................................................................... [disk_3 (5G)]
  o- rbd2.disk_4 ..................................................................................................... [disk_4 (5G)]

But I can't start rbd-target-api, because it reports the error "Jul 15 12:03:00 cephgw journal: Disk 'rbd2.disk_4' defined to the config, but image 'disk_4' can not be found in 'rbd2' pool".
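
A minimal diagnostic sketch, assuming the gateway reads the default /etc/ceph/ceph.conf and the client.admin keyring referenced by iscsi-gateway.cfg: run the lookup from the gateway node that fails to start, with the same identity the gateway uses, to see whether the image is visible from there:

# On the failing gateway node, using the gateway's own cluster identity
rbd --conf /etc/ceph/ceph.conf --id admin ls -l rbd2
rbd --conf /etc/ceph/ceph.conf --id admin info rbd2/disk_4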

dillaman commented 6 years ago

@longlnk If you add "debug = true" under the "[config]" section of "/etc/ceph/iscsi-gateway.cfg", do the logs in "/var/log/rbd-target-api.log" show more details for the failure? You would have to do this on all nodes.
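
A minimal sketch of that change, assuming GNU sed, the stock service names, and that the cfg lives at /etc/ceph/iscsi-gateway.cfg on every gateway node:

# Add "debug = true" right under the [config] section header if it is not set yet
sudo grep -q '^debug' /etc/ceph/iscsi-gateway.cfg || \
    sudo sed -i '/^\[config\]/a debug = true' /etc/ceph/iscsi-gateway.cfg

# Restart the services and watch the API log for the detailed failure
sudo systemctl restart rbd-target-gw rbd-target-api
sudo tail -f /var/log/rbd-target-api.log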