Smithx10 opened this issue 2 years ago
Thanks for the report.
> If the user provides a different service address it probably should just automatically take the next available port.
Hm, yes. We would have to read back all the already created targets and check for the highest port_id
... Probably not impossible, but I don't think we have precedent for that kind of logic yet. I will look at it.
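For reference, a rough shell sketch of what "take the next available port" could look like against the nvmet configfs tree (just a sketch; the helper name is made up, and linstor-gateway would presumably derive this from its own target data rather than from configfs):

```sh
# Hypothetical helper: return the next nvmet port id after the highest one in use.
next_port_id() {
    last=$(ls /sys/kernel/config/nvmet/ports 2>/dev/null | sort -n | tail -1)
    echo $(( ${last:-0} + 1 ))
}
```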
> If the user provides the same service address it should link it in. This probably should be fixed in resource-agents.
I don't think I fully understand this point. Right now I guess it would create a new portdir
and symlink the subsystem in there. Does the backend not accept this? How would we fix this in the resource agents?
I guess if anything linstor-gateway should look up whether or not there is already a target with the same addr and assign the same port_id if there is...
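Something along these lines could answer the "is there already a target with the same addr" question (again only a sketch with a made-up helper name; addr_traddr is the configfs attribute that holds a port's address):

```sh
# Hypothetical helper: print the port_id of an existing nvmet port whose
# address matches the given service address; fail if there is none.
port_id_for_addr() {
    addr=$1
    for port in /sys/kernel/config/nvmet/ports/*; do
        [ -d "$port" ] || continue
        if [ "$(cat "$port"/addr_traddr)" = "$addr" ]; then
            basename "$port"
            return 0
        fi
    done
    return 1
}
```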
Sorry if I wasn't clear.
> I guess if anything linstor-gateway should look up whether or not there is already a target with the same addr and assign the same port_id if there is...
Even if we use the same port_id for the same service_address, the current nvmet-port heartbeat code will never symlink in the subsystem.
nvmet_port_start() runs nvmet_port_monitor, which only checks whether $portdir exists. Since we already created the port earlier, it does exist, so the monitor returns 0 and we never reach the following:
```sh
for subsystem in ${OCF_RESKEY_nqns}; do
    ln -s /sys/kernel/config/nvmet/subsystems/${subsystem} \
        ${portdir}/subsystems/${subsystem}
done
```
The health check:

```sh
nvmet_port_monitor() {
    [ -d ${portdir} ] || return $OCF_NOT_RUNNING
    return $OCF_SUCCESS
}
```
Perhaps we should run a loop that links the subsystems even if the portdir already exists.
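A rough sketch of that idea, assuming we adjust the start path rather than the monitor (this is not the upstream nvmet-port code, just an illustration using the same variables as above):

```sh
# Sketch: keep the existing portdir if it is already there, but always link
# any subsystems from OCF_RESKEY_nqns that are not linked into it yet, so a
# second start on the same port still picks up new nqns.
nvmet_port_start_sketch() {
    [ -d "${portdir}" ] || mkdir "${portdir}"
    for subsystem in ${OCF_RESKEY_nqns}; do
        [ -L "${portdir}/subsystems/${subsystem}" ] && continue
        ln -s "/sys/kernel/config/nvmet/subsystems/${subsystem}" \
            "${portdir}/subsystems/${subsystem}"
    done
    return $OCF_SUCCESS
}
```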
After going through a PoC implementation of this behavior, I discovered that when you have 1 VIP with 4 subsystems, it's possible for reactor to promote the VIP on separate Primaries.
For example:

```
nvme create -r nvme_group linbit:nvme:demo0 10.91.230.214/32 10G
nvme create -r nvme_group linbit:nvme:demo1 10.91.230.214/32 10G
nvme create -r nvme_group linbit:nvme:demo2 10.91.230.214/32 10G
```
This can result in demo0 and demo1 on NodeA and demo2 on NodeB, both with the VIP 10.91.230.214.
Is there a way to make sure that Reactor can co-locate things like this?
Perhaps preferred-nodes? https://github.com/LINBIT/drbd-reactor/blob/master/doc/promoter.md#preferred-nodes
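For what it's worth, a hedged sketch of how that could be wired up, assuming the per-resource promoter configs that linstor-gateway generates can simply be extended with the preferred-nodes key from the linked promoter.md. Node names are placeholders, the append-to-file approach only works if the key ends up inside the resource's [promoter.resources.<name>] table, and this expresses a preference, not a hard colocation constraint:

```sh
# Illustration only, not a supported linstor-gateway workflow: bias all three
# resources toward the same node ordering so reactor prefers to promote them
# together on NodeA.
for name in demo0 demo1 demo2; do
    echo 'preferred-nodes = ["NodeA", "NodeB"]' \
        >> /etc/drbd-reactor.d/linstor-gateway-nvmeof-${name}.toml
done
# then reload drbd-reactor so it re-reads the snippets
```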
Currently, none of the nvme creates past the first will populate.
The reason is that the resource agent responsible for creating the port and linking the subsystems to that port never reaches that code, because of nvmet_port_monitor():
https://github.com/ClusterLabs/resource-agents/blob/main/heartbeat/nvmet-port#L137
The health check only checks for the existence of the directory; if it exists, the agent never iterates over the nqns: https://github.com/ClusterLabs/resource-agents/blob/main/heartbeat/nvmet-port#L148
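(As an aside, this is easy to see on a node: after a second create against the same service address, the port directory is there but only the first subsystem is linked. The paths below are the standard nvmet configfs locations.)

```sh
# Compare which subsystems exist vs. which are actually linked into the port(s).
ls /sys/kernel/config/nvmet/subsystems/
ls /sys/kernel/config/nvmet/ports/*/subsystems/
```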
I noticed that we don't populate port_id in "/etc/drbd-reactor.d/linstor-gateway-nvmeof-$name.toml"
We only populate:
Desired behavior:

- If the user provides a different service address, it probably should just automatically take the next available port. This probably should be fixed in linstor-gateway.
- If the user provides the same service address, it should link it in. This probably should be fixed in resource-agents.
- Potentially, port_id could be exposed to the user (sketched below), but that's probably not necessary.
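If port_id were ever exposed, it might look something like this (purely hypothetical; --port-id is not an existing option, and the command form just mirrors the example above):

```sh
# Hypothetical only: pin the nvmet port id for a target explicitly.
nvme create -r nvme_group --port-id 2 linbit:nvme:demo1 10.91.230.214/32 10G
```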