NetApp / trident

Storage orchestrator for containers
Apache License 2.0

Trident Upgrade 22.10->23.07 multipath device issue #883

Open sjlee-tech opened 5 months ago

sjlee-tech commented 5 months ago

Describe the solution you'd like

My environment is Kubernetes 1.25.6 / Trident 22.10 / Ubuntu 20.04 LTS, with 8 storage nodes, 110 worker nodes, and about 6000 LUNs in total, using the ontap-san-economy driver. I upgraded Trident from 22.10 to 23.07. After the upgrade, every LUN belonging to the vserver begins to appear as a multipath device on each worker node. Because the number of LUNs is so large, multipathd becomes slow to respond, and with the large number of pods this results in many errors.
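For reference, a rough way to confirm this symptom on a worker node is to compare the multipath and SCSI device counts against what is actually published to that node (a diagnostic sketch only; the NETAPP vendor string, node name, and expected counts are environment-specific):

```sh
# Count dm-multipath maps assembled on this node (NetApp LUNs report vendor NETAPP)
sudo multipath -ll | grep -c NETAPP

# Count SCSI disks reached over iSCSI (each LUN appears once per path)
lsblk -S | grep -c iscsi

# Compare with the volumes Trident has actually published to this node
kubectl get volumeattachment | grep -c <node-name>
```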

Describe alternatives you've considered

I also tested in the test environment provided by NetApp, and the same behavior occurs there.

Additional context

https://kb.netapp.com/Cloud/Astra/Trident/Error_when_Pods_are_rescheduled_-_error_publishing_ontap-san_driver%3A_problem_mapping_LUN._attr%3A_LUN_already_mapped_to_initiator(s)_in_this_group

My understanding is that ontap-san-economy transitions to per-node igroups starting with 23.07, but I don't know under what conditions the transition happens. In the test environment (about 2 nodes and 3 pods) the volumes did not switch to per-node igroups consistently when I tried various things such as rebooting servers or deleting and recreating pods and PVCs, although they did eventually switch. Since my environment is large, I need a way to switch to per-node igroups quickly, and a safe procedure for the upgrade.
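In case it helps others checking the same thing: one way to see whether LUNs have actually moved to per-node igroups is to inspect the igroups and LUN mappings on the SVM from the ONTAP CLI (a sketch only; the vserver name is a placeholder, and the igroup names Trident creates vary by version):

```sh
# With the old scheme there is typically one backend-scoped igroup
# containing every worker node's IQN
lun igroup show -vserver <svm-name>

# A migrated LUN should be mapped only to an igroup dedicated to the
# node it is currently published on
lun mapping show -vserver <svm-name>
```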

(screenshots attached)

StevenBarre commented 5 months ago

The "iscsi self heal causes all LUNs to be mapped" change was in 23.01 and separate from the "per-node igroup". But the two changes definitely don't work together. We had to downgrade back to 23.01 after a production outage caused by this in October and have had a Sev 1 case open with NetApp since then. Still waiting for a fix.

jwebster7 commented 3 months ago

@sjlee-tech The only supported way to switch to using per-node igroups is to recreate your workloads. Specifically, a PV cannot be in use anywhere in the cluster for a given LUN to be migrated. Per-node igroups were introduced in v23.04.

The team investigated whether there was a method to perform an automated LUN remapping. This would involve remapping each LUN to the new per-node igroup, then immediately unmapping it from the old igroup at upgrade time. However, this isn't a supported feature and is a limitation in ONTAP (and most other SAN environments). Specifically, a single initiator IQN shouldn't be able to access a LUN through two or more distinct igroups. Even if this were supported, it could expose the backend data to corruption. As a result, Trident can only map a LUN to a different igroup at CSI ControllerPublishVolume time (when you create a workload with a CSI volume).

The issue of discovering all LUNs has also been investigated and tracked. It is due to an unforeseen incompatibility between iSCSI self-healing and Trident's prior igroup scheme. Trident used to use backend-scoped igroups, so when iSCSI self-healing determines it must perform a scan, that scan exposes every LUN mapped to the shared igroup to a single host.

For now, please disable iSCSI self-healing until all of your workloads consuming Trident-provisioned SAN volumes are using per-node igroups. Some efforts have been tracked to put guardrails in place to avoid these scenarios. They currently do not have an ETA, but I have pushed that work to the appropriate channels for prioritization and planning.
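As a hedged sketch of what "disable iSCSI self-healing" might look like with an operator-managed install: the TridentOrchestrator spec exposes an iscsiSelfHealingInterval setting, and setting the interval to zero is assumed to disable self-healing (confirm the field name and semantics against the docs for your Trident version):

```sh
# Assumption: a zero interval disables iSCSI self-healing
kubectl patch torc trident --type merge \
  -p '{"spec":{"iscsiSelfHealingInterval":"0s"}}'

# Confirm the operator has applied the change
kubectl describe torc trident | grep -i heal
```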

sjlee-tech commented 3 months ago

I understood your comment to mean "I cannot upgrade Trident on my system at this time".

If a version with the planned fixes is released, will it be possible to upgrade directly from my current Trident 22.10 in one step?

https://github.com/NetApp/trident/issues/864

Does your comment mean I should try upgrading after disabling iSCSI self-healing?

I'm concerned because I don't know how disabling iSCSI self-healing will affect my system.

vlgenf71 commented 3 months ago

We migrated to Trident v23.04 a few weeks ago, and we still have only one igroup for all of our cluster's worker nodes.

Is there a special configuration parameter to add to the Trident config to activate the new per-node igroups feature?

That's the only information I found about per-node igroups in the Trident docs: "All ONTAP-SAN-* volumes will now use per-node igroups. LUNs will only be mapped to igroups while actively published to those nodes to improve our security posture. Existing volumes will be opportunistically switched to the new igroup scheme when Trident determines it is safe to do so without impacting active workloads (Issue #758)."

jwebster7 commented 3 months ago

@vlgenf71 Trident v23.04+ will use per-node igroups for ONTAP-SAN-* volumes when you recreate the workloads. Trident cannot remap LUNs to new igroups while those LUNs are currently in use in your Kubernetes cluster. This is a limitation imposed by the backend. There is no way to automatically remap LUNs to use per-node igroups in Trident today. The team ruled against adding that feature because it impacts active workloads, goes outside of CSI for storage backend access, and typically isn't something SCSI systems support.

As a result, Trident v23.04+ will remap a LUN to a per-node igroup only if the LUN is not in use anywhere in the cluster (i.e., not attached or mounted to any Kubernetes workload). What this means for you and @sjlee-tech is that you will have to recreate your workloads to utilize per-node igroups.
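Concretely, "recreate your workloads" means making sure no pod holds the PV anywhere in the cluster before it is published again, so that the next ControllerPublishVolume maps the LUN into a per-node igroup. A minimal sketch for a Deployment (names are placeholders; note that a rolling restart may keep the volume attached throughout and therefore may not trigger the migration):

```sh
# Scale to zero so the PV is unpublished from every node
kubectl scale deployment <app> -n <namespace> --replicas=0

# Wait until no VolumeAttachment references the PV any longer
kubectl get volumeattachment | grep <pv-name>

# Bring the workload back; the LUN is remapped at publish time
kubectl scale deployment <app> -n <namespace> --replicas=1
```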

With regard to iSCSI self-healing, @sjlee-tech: there is an incompatibility between this feature and Trident's old igroup scheme. You need to upgrade to a version of Trident that supports per-node igroups with iSCSI self-healing disabled, then recreate your workloads consuming ONTAP-SAN-* volumes. Once you have done this, you should be able to safely re-enable iSCSI self-healing.
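Once every ONTAP-SAN-* workload has been recreated onto per-node igroups, self-healing can be turned back on by restoring the interval, for example to its documented default (again assuming the iscsiSelfHealingInterval field and a 5-minute default; check the docs for your release):

```sh
# Assumption: 5m0s is the default self-healing interval for this release
kubectl patch torc trident --type merge \
  -p '{"spec":{"iscsiSelfHealingInterval":"5m0s"}}'
```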