ceph / ceph-iscsi

Ceph iSCSI tools
GNU General Public License v3.0
62 stars 59 forks source link

lun: skip defining the lun when image not found #252

Closed lxbsz closed 2 years ago

lxbsz commented 2 years ago

@idryomov Fixed them all. Please take a look. thanks.

idryomov commented 2 years ago

Could you explain the purpose of "common: add sub item support for update_item" commit? It wasn't present in the previous changeset -- what is the problem you are trying to solve there?

And regardless of the reasoning behind it, there are about 40 existing update_item calls. Instead of touching almost all of the them to pass an extra None argument, I'd introduce a new update_subitem method and use it in that one or two places where you are passing in something other than None.

lxbsz commented 2 years ago

I see you introduced update_sub_item, but it's not called by anything other than update_item (which passes None for item_name). To me this means that either something is broken or all update_item-related changes are bogus.

Sorry, I pushed this from wrong node. Updated it.

Continuing that thought, you haven't explained the purpose of "common: add sub item support for update_item" commit. What it does, why it's needed, is it a bug fix or an optimization, etc? If it's just an optimization/cleanup, let's move it to a separate PR to not stall the "image not found" bug fixes.

I just added more comments in the commit comments and code comments.

When recovery the config from the rbd image and LIO device, because the image size may changed, for example if old size is 1M and new size is 2M, then we need to sync this changes to other gateways.

Actually the first 3 commits could fix the bz. The last 2 are one further optimization. I am okay to move them into a separate PR.

lxbsz commented 2 years ago

Continuing that thought, you haven't explained the purpose of "common: add sub item support for update_item" commit.

For this, because only the auth gateway could change the relevant config["disks"][disk_key], and for all the other gateways it's readonly. Since the rbd image size could change after manually created, so after recovering the disk config we need to let other gateways to properly set the new size to LIO device. I need to propagate this info via the config["disks"][disk_key], and after other gateways finishing this they need to remove the related info from the config["disks"][disk_key].

The "recovery" item will include the ip list of none-auth gateways.

idryomov commented 2 years ago

Actually the first 3 commits could fix the bz. The last 2 are one further optimization. I am okay to move them into a separate PR.

Is it the first 2 commits or the first 3 commits? There are only 2 commits in this PR now... Should "lun: try to recovery the config from the LIO" commit be re-added?

lxbsz commented 2 years ago

Actually the first 3 commits could fix the bz. The last 2 are one further optimization. I am okay to move them into a separate PR.

Is it the first 2 commits or the first 3 commits? There are only 2 commits in this PR now... Should "lun: try to recovery the config from the LIO" commit be re-added?

I just now checked the BZ, actually the last 3 commits will fix the second bugs mentioned in the BZ.

For now I just simplified the last 3 commit and appended it here.

lxbsz commented 2 years ago

@idryomov My test is done. Please take a look. Thanks.

idryomov commented 2 years ago

Otherwise LGTM. Any idea why the tox job is stuck Pending?

lxbsz commented 2 years ago

Otherwise LGTM. Any idea why the tox job is stuck Pending?

Not sure. it seems long time ago ?

idryomov commented 2 years ago

@lxbsz LGTM, I'm experimenting with getting the tox tests back.

idryomov commented 2 years ago

jenkins tox