Open johnkeates opened 7 years ago
Interestingly, on line 335 in the iSCSITarget RA, it states:
# lio distinguishes between targets and target portal
# groups (TPGs). We will always create one TPG, with the
# number 1. In lio, creating a network portal
# automatically creates the corresponding target if it
# doesn't already exist.
So this was already known...
Digging around some more: on line 334 there seems to be a case where a loop containing an if/else block might create a default target for the tpg if a default portal was found:
for portal in ${OCF_RESKEY_portals}; do
    if [ $portal != ${OCF_RESKEY_portals_default} ] ; then
        IFS=':' read -a sep_portal <<< "$portal"
        ocf_run targetcli /iscsi/${OCF_RESKEY_iqn}/tpg1/portals create "${sep_porta$
    else
        ocf_run targetcli /iscsi create ${OCF_RESKEY_iqn} || exit $OCF_ERR_GENERIC
    fi
done
Nothing later checks for this before the function that is supposed to create the target runs, and this causes the issue.
So specifying portal="0.0.0.0:3260" or no portal at all will cause iSCSITarget to fail, since LIO-T will have created a target automatically before the RA reaches the point where it wants to create the target itself.
Manually enumerating all the portals for a target resolves this, but isn't really what you want.
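One way to make the create step tolerate a target that already exists is to check configFS before calling targetcli. The following is only a minimal sketch, not the fix that was eventually merged: `lio_target_exists`, `create_target_once` and the `CONFIGFS_ISCSI_ROOT` override are hypothetical names introduced here, and the sketch assumes LIO exposes each iSCSI target as a directory under /sys/kernel/config/target/iscsi/.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the iSCSITarget RA.
# CONFIGFS_ISCSI_ROOT is an assumed override so the check can be tried
# without a live LIO setup; by default it points at the configFS location
# where LIO is assumed to expose iSCSI targets.
: "${CONFIGFS_ISCSI_ROOT:=/sys/kernel/config/target/iscsi}"

lio_target_exists() {
    # $1: target IQN. Succeeds if configFS already has a directory for it.
    [ -n "$1" ] && [ -d "${CONFIGFS_ISCSI_ROOT}/$1" ]
}

create_target_once() {
    # $1: target IQN; remaining arguments: the command that creates the target.
    iqn=$1
    shift
    if lio_target_exists "$iqn"; then
        # Already created (for example as a side effect of portal setup),
        # so treat the create step as done instead of failing.
        echo "target $iqn already present, skipping create"
        return 0
    fi
    "$@"
}
```

Inside the RA this could be called as `create_target_once "${OCF_RESKEY_iqn}" ocf_run targetcli /iscsi create "${OCF_RESKEY_iqn}" || exit $OCF_ERR_GENERIC`, so a second start attempt no longer trips over "This Target already exists in configFS".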
On Fri, Aug 25, 2017 at 07:45:22AM -0700, John Keates wrote:
The iSCSITarget RA never succeeds in automatically starting a target. It seems to first create the target and then tries to create it again (which obviously fails),
The start action must be idempotent, so this is already a problem. The relevant error is this:
ERROR: This Target already exists in configFS
and then exit with error 1. Checking targetcli shows the target, so it's not exiting gracefully either.
Manually starting it twice with pcs resource debug-start does make it work, but then it still fails in a different way: it never adds the target portal so no connections from initiators can be made.
Did you try to post to the pacemaker ML? You may get more audience there about RA behaviour.
Did you open a bug with Debian?
Any news on how to fix this?
Not from me, sorry. We moved our setup away from HA Clustered to HA load-balanced with plenty of spare capacity to have stuff fail without impact.
Hit this bug too on CentOS 7.6.1810.
The patch fixes the startup, but I also needed to change the resource with pcs resource update.
The patch in PR 1239 above should fix this issue for you.
How can I see what version CentOS is using? The agent is in package: resource-agents-4.1.1-12.el7_6.8.x86_64
It isn't available in a release yet, but you should be able to patch it manually, or report to CentOS that the patch should be applied to their current version.
Thanks, I have it patched manually for now. But I want to make sure that when I upgrade the package, I get a version that includes the fix. Is the automatic portal generation also fixed?
The iSCSITarget RA never succeeds in automatically starting a target. It seems to first create the target and then tries to create it again (which obviously fails), and then exit with error 1. Checking targetcli shows the target, so it's not exiting gracefully either.
Manually starting it twice with pcs resource debug-start does make it work, but then it still fails in a different way: it never adds the target portal so no connections from initiators can be made.
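To confirm the missing-portal symptom without going through targetcli, the network portals of TPG 1 can be read from configFS. This is a rough diagnostic sketch only: `list_tpg1_portals`, `tpg1_has_portal` and `CONFIGFS_ISCSI_ROOT` are hypothetical names, and it assumes LIO lists each portal as an `ip:port` entry under `<iqn>/tpgt_1/np/`.

```shell
#!/bin/sh
# Hypothetical diagnostic helpers, not part of the RA.
# CONFIGFS_ISCSI_ROOT is an assumed override for trying this outside a
# live LIO setup.
: "${CONFIGFS_ISCSI_ROOT:=/sys/kernel/config/target/iscsi}"

list_tpg1_portals() {
    # $1: target IQN. Prints one "ip:port" per line for TPG 1.
    # Fails if the target or its TPG 1 does not exist.
    np_dir="${CONFIGFS_ISCSI_ROOT}/$1/tpgt_1/np"
    [ -d "$np_dir" ] || return 1
    for entry in "$np_dir"/*; do
        [ -e "$entry" ] || continue   # glob matched nothing: no portals
        basename "$entry"
    done
    return 0
}

tpg1_has_portal() {
    # $1: target IQN, $2: portal as ip:port. Succeeds on an exact match.
    list_tpg1_portals "$1" | grep -Fqx "$2"
}
```

After a start attempt, `tpg1_has_portal "$iqn" "0.0.0.0:3260"` failing while the target directory exists would match the behaviour described here: the target is present but initiators have nothing to connect to.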
The target primitive is about as simple as it gets:
corosync: 2.4.2
pacemaker: 1.1.16
targetcli-fb: 2.1.43
OS: Debian 9.1
The RA scripts were one revision behind the ones in this repo; the only difference was targetcli lock file sharing between LUN and Target setup. I replaced the ones I had with the ones from the repo, but that didn't change anything (and I didn't really expect it to).