IBM / ibm-block-csi-driver

The IBM block storage CSI driver enables container orchestrators, such as Kubernetes and OpenShift, to manage the lifecycle of persistent storage.
Apache License 2.0

"malformed node or string" on host-definer with topology-awareness enabled #662

Open kolovo opened 1 year ago

kolovo commented 1 year ago

Hi, the host-definer fails to read the secret when topology awareness is enabled, and as a result host ports cannot be dynamically configured on the storage array.

The Secret was created according to the official documentation: https://www.ibm.com/docs/en/stg-block-csi-driver/1.11.0?topic=topology-creating-secret-awareness

Error message on host-definer:

2023-03-15 14:53:26,796 INFO [139832879761152] [Thread-5] (manager.py:_get_secret_data:293) - Reading secret ibm-nvme-topo-secret in namespace ibm-csi
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/lib64/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib64/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/driver/controllers/servers/host_definer/watcher/storage_class_watcher.py", line 29, in watch_storage_class_resources
    secrets_info = self._get_secrets_info_from_storage_class_with_driver_provisioner(storage_class_info)
  File "/driver/controllers/servers/host_definer/watcher/storage_class_watcher.py", line 38, in _get_secrets_info_from_storage_class_with_driver_provisioner
    return self._get_secrets_info_from_storage_class(storage_class_info)
  File "/driver/controllers/servers/host_definer/watcher/storage_class_watcher.py", line 49, in _get_secrets_info_from_storage_class
    secret_data = self._get_secret_data(secret_name, secret_namespace)
  File "/driver/controllers/servers/host_definer/kubernetes_manager/manager.py", line 295, in _get_secret_data
    return self._change_decode_base64_secret_config(secret_data)
  File "/driver/controllers/servers/host_definer/kubernetes_manager/manager.py", line 305, in _change_decode_base64_secret_config
    secret_data[settings.SECRET_CONFIG_FIELD] = self._decode_base64_to_dict(
  File "/driver/controllers/servers/host_definer/kubernetes_manager/manager.py", line 313, in _decode_base64_to_dict
    my_dict_again = ast.literal_eval(base64.b64decode(base64_dict))
  File "/usr/lib64/python3.8/ast.py", line 99, in literal_eval
    return _convert(node_or_string)
  File "/usr/lib64/python3.8/ast.py", line 98, in _convert
    return _convert_signed_num(node)
  File "/usr/lib64/python3.8/ast.py", line 75, in _convert_signed_num
    return _convert_num(node)
  File "/usr/lib64/python3.8/ast.py", line 66, in _convert_num
    _raise_malformed_node(node)
  File "/usr/lib64/python3.8/ast.py", line 63, in _raise_malformed_node
    raise ValueError(f'malformed node or string: {node!r}')
ValueError: malformed node or string: b' {\n "dev-management-id-2": {\n "username": "demo",\n "password": "demo",\n "management_address": "192.168.1.11",\n "supported_topologies": [\n {\n "topology.block.csi.ibm.com/dc-region": "demo",\n "topology.block.csi.ibm.com/dc-zone": "demo-1"\n }\n ]\n },\n "dev-management-id-1": {\n "username": "demo",\n "password": "demo",\n "management_address": "192.168.1.10",\n "supported_topologies": [\n {\n "topology.block.csi.ibm.com/dc-region": "demo2",\n "topology.block.csi.ibm.com/dc-zone": "demo2-1"\n }\n ]\n }\n }\n'

Error message on scheduled pod:

Normal Scheduled 79s default-scheduler Successfully assigned default/task-pv-pod to mighty-ewe
Warning FailedAttachVolume 10s (x8 over 78s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-4a11b9f7-c064-44dd-940b-c942f3d75dfd" : rpc error: code = NotFound desc = Host for node: Initiators(nvme_nqns=['nqn.2222-11.org.nvmexpress:uuid:82a355e4-f2ee-adsd-asdd-2385ec84ec2b'], fc_wwns=['101110234bdf67b8', '100000109bdfbc24'], iscsi_iqns=[]) was not found, ensure all host ports are configured on storage

From the error message it's clear that ast.literal_eval receives a bytes object (the raw output of base64.b64decode) as its argument, not a string. According to the Python documentation, the function accepts only a string or an AST node, so the bytes value is rejected as a malformed node before its contents are ever parsed.
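
The behaviour can be reproduced outside the driver. The snippet below is a minimal sketch, not the driver's code: the config dictionary is a shortened, made-up stand-in for the secret, and the two helper functions are hypothetical names. The first helper mirrors the call shown in the traceback; the second shows one possible workaround, assuming the secret's config field is always valid JSON.

```python
import ast
import base64
import json

# Hypothetical, shortened stand-in for the secret's config field.
config = {
    "dev-management-id-1": {
        "username": "demo",
        "password": "demo",
        "management_address": "192.168.1.10",
        "supported_topologies": [
            {"topology.block.csi.ibm.com/dc-zone": "demo2-1"}
        ],
    }
}
encoded = base64.b64encode(json.dumps(config, indent=2).encode("utf-8"))


def decode_with_literal_eval(base64_dict):
    # Mirrors the call in the traceback: b64decode returns bytes,
    # and the bytes go straight into ast.literal_eval.
    return ast.literal_eval(base64.b64decode(base64_dict))


def decode_with_json(base64_dict):
    # Hypothetical alternative: decode the bytes to str, then parse as JSON.
    return json.loads(base64.b64decode(base64_dict).decode("utf-8"))


try:
    decode_with_literal_eval(encoded)
    literal_eval_error = None
except ValueError as error:
    literal_eval_error = str(error)

decoded = decode_with_json(encoded)

print(literal_eval_error is not None)
print(decoded["dev-management-id-1"]["management_address"])
```

json.loads is used here instead of calling ast.literal_eval on the decoded string because the payload is JSON, and JSON values such as true or null are not Python literals. Whether the maintainers would fix it this way is not something I can confirm.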

The driver version in use is 1.10.0.

Thank you

kasserater commented 2 weeks ago

Does this still happen with the latest version, 1.11.3? We were unable to reproduce this issue in our lab.