ansible-collections / community.vmware

Ansible Collection for VMware
GNU General Public License v3.0

Add module to obtain fcd info (vmware_first_class_disk_info) #1988

Closed gnarc closed 5 months ago

gnarc commented 5 months ago
SUMMARY

We should have a module to obtain info on fcds (first class disks). As of today it is only possible to create and delete them using the vmware_first_class_disk module. Deleting in particular can be difficult if we do not know their names. I did not find another way to get this information, e.g. via some kind of info module for volumes or datastores.

ISSUE TYPE
COMPONENT NAME

vmware_first_class_disk_info

ADDITIONAL INFORMATION

The use case is less about creating fcds and more about doing some housekeeping in your vCenter after a k8s or OpenShift cluster has been deleted and its fcds have not been cleaned up.

govc can provide the information and achieve the cleanup but I seek a more native way:

# govc volume.ls -l
fade77df-8f6c-431f-af10-c80129403d13    pvc-5fdb4698-1bd6-4d82-9dac-c5eff7ff53b7        80.0GB  KUBERNETES      CLUSTER-A
f5c5ddf9-5298-40f3-9faa-3b6eda429b7f    pvc-e54c88a8-e618-4810-9b92-30ef3a36e71b        96.0MB  KUBERNETES      CLUSTER-A
752e14e3-73c9-4439-ae97-38649bd74587    pvc-f26b1e9b-8d12-4aef-86a5-33f21e206860        96.0MB  KUBERNETES      CLUSTER-B
049ea98c-96aa-4b66-b7c3-72bf4fad41d5    pvc-49eb5811-d782-4900-a307-a552ec1889e5        25.0GB  KUBERNETES      CLUSTER-B
0d3d4f65-81d8-4f4d-b311-aa4b54306054    pvc-cf97889a-149d-46a9-bf5b-cffa52492a26        80.0GB  KUBERNETES      CLUSTER-C

List fcds for Cluster-A:
# for i in $(govc volume.ls -l |grep CLUSTER-A | awk '{print $1}'); do echo $i ; govc volume.ls $i ; done
Remove everything for Cluster-A:
# for i in $(govc volume.ls -l |grep CLUSTER-A | awk '{print $1}'); do echo $i ; govc volume.rm $i ; done

Using govc volume.ls -l -json provides even more details. 
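
For reference, a minimal pyVmomi sketch of the kind of lookup such an _info module could wrap (hostname, credentials and the datastore name are placeholders; this is not the proposed module itself):

# Minimal pyVmomi sketch, not the proposed module itself: enumerate the
# first class disks on one datastore and print id, name and size.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

context = ssl._create_unverified_context()  # lab only, skips certificate validation
si = SmartConnect(host='vcenter.example.com', user='some-user@openshift.local',
                  pwd='some-password', sslContext=context)
try:
    content = si.RetrieveContent()
    vsom = content.vStorageObjectManager

    # look up the datastore object by name
    view = content.viewManager.CreateContainerView(content.rootFolder, [vim.Datastore], True)
    datastore = next(ds for ds in view.view if ds.name == 'ESX04-Storage-01')

    # list all fcd ids on the datastore, then fetch the details per disk
    for disk_id in vsom.ListVStorageObject(datastore=datastore):
        disk = vsom.RetrieveVStorageObject(id=disk_id, datastore=datastore)
        print(disk.config.id.id, disk.config.name, disk.config.capacityInMB)
finally:
    Disconnect(si)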
ihumster commented 5 months ago

Doing 'state: list' is ideologically wrong; for this there must be an _info or _facts module. And of course, as always, a maintainer is required for this work.

gnarc commented 5 months ago

Valid points. Changed title and rephrased the initial comment.

Nina2244 commented 5 months ago

@gnarc I hope this module works for you?

gnarc commented 5 months ago

@Nina2244 Thanks for the quick response and code.

I tested with

ansible==9.2.0 ansible-core==2.16.3 pyvmomi==8.0.2.0.1 vsphere: 7.0.3.00700

It does not work out of the box and it fails if I change the datastores:

The module, however, would not help in my case as we use fcds in the context of Kubernetes/OpenShift; we therefore need at least the k8s cluster name to map the disks correctly.

Nina2244 commented 5 months ago

@gnarc Thank you for testing and for the annotations.

The code wasn't tested because we don't use this in our environment.

I think I fixed the first, second and third thing. Could you test again?

I'm not quite sure what is meant by the fourth. That's why I couldn't fix it.

Regarding what would help in your case: it is difficult for me to develop the whole thing because, as already mentioned, we do not use this ourselves. Maybe you could show me what you want; then perhaps I can build it for you.

gnarc commented 5 months ago

@Nina2244 Thanks, the fixes seem to work. Not sure what the consumerId is used for, but since we already have it... We might also add the actual disk id (see first column of the govc output in https://github.com/ansible-collections/community.vmware/issues/1988#issue-2107459690). A line in https://github.com/ansible-collections/community.vmware/blob/d8c5d68d246814d5f1b1e66ce6425f3300afca36/plugins/modules/vmware_first_class_disk_info.py like

dids=str(disk.config.id.id),

works for me to get an output like this:

TASK [debug] *****************************************************************************************************
ok: [localhost] => {
    "disk_info.first_class_disks": [
        {
            "consumer_ids": [],
            "consumption_type": [
                "disk"
            ],
            "datastore_name": "ESX04-Storage-01",
            "descriptor_version": null,
            "dids": "932a0c64-2dac-42f8-8c16-5572dce490b8",
            "name": "pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc",
            "size_mb": 1024
        },
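
In pyVmomi terms the value comes straight from the VStorageObject retrieved per disk; a simplified sketch of the per-disk dict (the field names mirror the output above, the real module code may differ):

# Simplified assumption of the per-disk result, not the module's exact code;
# `disk` is the vim.vslm.VStorageObject fetched for each first class disk.
first_class_disk = dict(
    name=disk.config.name,
    size_mb=disk.config.capacityInMB,
    consumption_type=list(disk.config.consumptionType),
    consumer_ids=[c.id for c in (disk.config.consumerId or [])],
    dids=str(disk.config.id.id),  # the disk/volume id govc shows in its first column
)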

I was able to "recover" one of my datastores to get output with our module by doing some cleanup and restarting some OpenShift cluster nodes. My assumption here would be that something is blocking or interrupting the API call, causing that traceback. There are still two DS in my vSphere w/o fcds which still cause the problem. But since this is a very sloppy, badly patched lab environment with a lot of users, I think further analysis of this has to be on hold for now.

I will upload some info later to make it more clear what I'm still missing in the output.

gnarc commented 5 months ago
  1. So, we should have everything to create/delete basic fcds and get their info using the two modules community.vmware.vmware_first_class_disk and our new one community.vmware.vmware_first_class_disk_info, e.g.:
    - name: Create Disk fcd w/o k8s (cns)
      community.vmware.vmware_first_class_disk:
        hostname: vcenter.ocp.consol.de
        username: some-user@openshift.local
        password: some-password
        datastore_name: "{{ DSN }}"
        validate_certs: false
        disk_name: 'mytest'
        size: '1GB'
        state: present
      delegate_to: localhost

    Will create

    "disk_info.first_class_disks": [
        {
            "consumer_ids": [],
            "consumption_type": [
                "disk"
            ],
            "datastore_name": "ESX04-Storage-01",
            "descriptor_version": null,
            "dids": "03fb5d4a-58b6-40e4-8540-751ece314a86",
            "name": "mytest",
            "size_mb": 1024
        },

    Using govc we can see it too:

    
    # govc disk.ls -ds ESX04-Storage-01  -l  03fb5d4a-58b6-40e4-8540-751ece314a86
    03fb5d4a-58b6-40e4-8540-751ece314a86  mytest  1.0G  Feb 13 09:52:32

govc disk.ls -ds ESX04-Storage-01 -l -L 03fb5d4a-58b6-40e4-8540-751ece314a86

03fb5d4a-58b6-40e4-8540-751ece314a86 [ESX04-Storage-01] fcd/efc20e83a6234243a59ed73a729dd7f0.vmdk 1.0G Feb 13 09:52:32

govc disk.ls -ds ESX04-Storage-01 -json 03fb5d4a-58b6-40e4-8540-751ece314a86

{ "Objects": [ { "Config": { "Id": { "Id": "03fb5d4a-58b6-40e4-8540-751ece314a86" }, "Name": "mytest", "CreateTime": "2024-02-13T09:52:32.303913Z", "KeepAfterDeleteVm": true, "RelocationDisabled": false, "NativeSnapshotSupported": false, "ChangedBlockTrackingEnabled": false, "Backing": { "Datastore": { "Type": "Datastore", "Value": "datastore-4481" }, "FilePath": "[ESX04-Storage-01] fcd/efc20e83a6234243a59ed73a729dd7f0.vmdk", "BackingObjectId": "", "Parent": null, "DeltaSizeInMB": 0, "KeyId": null, "ProvisioningType": "thin" }, "Metadata": null, "Vclock": null, "Iofilter": null, "CapacityInMB": 1024, "ConsumptionType": [ "disk" ], "ConsumerId": null }, "Tags": null } ] }

How it looks in the UI:
![basic_fcp_UI](https://github.com/ansible-collections/community.vmware/assets/37799887/1e35f6da-4db3-4d3c-87ed-54c757f835f5)

--> However, this fcd will not be listed using `# govc volume.ls -ds ESX04-Storage-01 -l` since it has not been created as CNS (see  [govc volume.ls](https://github.com/vmware/govmomi/blob/main/govc/USAGE.md#volumels))

2. When creating a PVC/PV in OpenShift or k8s, we will end up with an fcd plus some metadata:

oc get pvc

NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE
humbug   Bound    pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc   1Gi        RWO            thin-csi-no-wait   4h28m

oc get pv pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS       REASON   AGE
pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc   1Gi        RWO            Delete           Bound    zz-jgeo/humbug   thin-csi-no-wait            4h29m

oc get pv pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc -o yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: csi.vsphere.vmware.com
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
  creationTimestamp: "2024-02-13T09:05:38Z"
  finalizers:

The fcd can be shown with our module or govc disk.ls...

        {
            "consumer_ids": [],
            "consumption_type": [
                "disk"
            ],
            "datastore_name": "ESX04-Storage-01",
            "descriptor_version": null,
            "dids": "932a0c64-2dac-42f8-8c16-5572dce490b8",
            "name": "pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc",
            "size_mb": 1024
        },
# govc disk.ls -ds ESX04-Storage-01 -l  932a0c64-2dac-42f8-8c16-5572dce490b8
932a0c64-2dac-42f8-8c16-5572dce490b8  pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc  1.0G  Feb 13 09:05:38

# govc disk.ls -ds ESX04-Storage-01 -l -L  932a0c64-2dac-42f8-8c16-5572dce490b8
932a0c64-2dac-42f8-8c16-5572dce490b8  [ESX04-Storage-01] fcd/50f2957d2c564162b2a959b580c88259.vmdk  1.0G  Feb 13 09:05:38

{
  "Objects": [
    {
      "Config": {
        "Id": {
          "Id": "932a0c64-2dac-42f8-8c16-5572dce490b8"
        },
        "Name": "pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc",
        "CreateTime": "2024-02-13T09:05:38.400883Z",
        "KeepAfterDeleteVm": true,
        "RelocationDisabled": false,
        "NativeSnapshotSupported": false,
        "ChangedBlockTrackingEnabled": false,
        "Backing": {
          "Datastore": {
            "Type": "Datastore",
            "Value": "datastore-4481"
          },
          "FilePath": "[ESX04-Storage-01] fcd/50f2957d2c564162b2a959b580c88259.vmdk",
          "BackingObjectId": "",
          "Parent": null,
          "DeltaSizeInMB": 0,
          "KeyId": null,
          "ProvisioningType": "thin"
        },
        "Metadata": null,
        "Vclock": null,
        "Iofilter": null,
        "CapacityInMB": 1024,
        "ConsumptionType": [
          "disk"
        ],
        "ConsumerId": null
      },
      "Tags": null
    }
  ]
}

...but currently only govc with volume.ls will provide the CNS info:

(this is just a PVC w/o a consumer, i.e. no pod is using it; therefore container-related info does not exist)

# govc volume.ls -ds ESX04-Storage-01 -l  932a0c64-2dac-42f8-8c16-5572dce490b8
932a0c64-2dac-42f8-8c16-5572dce490b8    pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc        1.0GB   KUBERNETES      dev06-42dmd

# govc volume.ls -ds ESX04-Storage-01 -json  932a0c64-2dac-42f8-8c16-5572dce490b8
{
  "Volume": [
    {
      "VolumeId": {
        "Id": "932a0c64-2dac-42f8-8c16-5572dce490b8"
      },
      "DatastoreUrl": "ds:///vmfs/volumes/63243668-091a5050-9898-84160c1591a8/",
      "Name": "pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc",
      "VolumeType": "BLOCK",
      "StoragePolicyId": "d19711d4-076d-4219-8904-fd11dd44f47e",
      "Metadata": {
        "ContainerCluster": {
          "ClusterType": "KUBERNETES",
          "ClusterId": "dev06-42dmd",
          "VSphereUser": "some-user@openshift.local",
          "ClusterFlavor": "VANILLA",
          "ClusterDistribution": ""
        },
        "EntityMetadata": [
          {
            "EntityName": "humbug",
            "Labels": null,
            "Delete": false,
            "ClusterID": "dev06-42dmd",
            "EntityType": "PERSISTENT_VOLUME_CLAIM",
            "Namespace": "zz-jgeo",
            "ReferredEntity": [
              {
                "EntityType": "PERSISTENT_VOLUME",
                "EntityName": "pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc",
                "Namespace": "",
                "ClusterID": "dev06-42dmd"
              }
            ]
          },
          {
            "EntityName": "pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc",
            "Labels": null,
            "Delete": false,
            "ClusterID": "dev06-42dmd",
            "EntityType": "PERSISTENT_VOLUME",
            "Namespace": "",
            "ReferredEntity": null
          }
        ],
        "ContainerClusterArray": [
          {
            "ClusterType": "KUBERNETES",
            "ClusterId": "dev06-42dmd",
            "VSphereUser": "some_user@openshift.local",
            "ClusterFlavor": "VANILLA",
            "ClusterDistribution": ""
          }
        ]
      },
      "BackingObjectDetails": {
        "CapacityInMb": 1024,
        "BackingDiskId": "932a0c64-2dac-42f8-8c16-5572dce490b8",
        "BackingDiskUrlPath": "",
        "BackingDiskObjectId": ""
      },
      "ComplianceStatus": "compliant",
      "DatastoreAccessibilityStatus": "accessible",
      "HealthStatus": "green"
    }
  ]
}
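
Until the module can expose this, a stopgap is to parse the JSON govc already emits. A minimal sketch, assuming govc is installed and authenticated via the usual GOVC_* environment variables, with the key names as in the output above:

# Stopgap sketch: pull the Kubernetes cluster id per CNS volume out of
# `govc volume.ls -json` (key names as shown in the output above).
import json
import subprocess

out = subprocess.run(['govc', 'volume.ls', '-ds', 'ESX04-Storage-01', '-json'],
                     capture_output=True, text=True, check=True)
for vol in json.loads(out.stdout).get('Volume', []):
    meta = vol.get('Metadata') or {}
    cluster = meta.get('ContainerCluster') or {}
    print(vol['VolumeId']['Id'], vol['Name'], cluster.get('ClusterId'))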

The fcd folder contains the vmdk plus a vmfd file which contains the metadata:

[screenshot: cns_fcp_UI_1]

which (probably) is used to provide the info under DS -> Monitor -> Cloud Native Storage: Container Volumes:

[screenshots: cns_fcp_UI_2, cns_fcp_UI_3, cns_fcp_UI_4]

So, I believe that if we want our module to be complete, we should be able to retrieve the CNS information as well. With the additional (CNS) information we could process fcds created by OpenShift or k8s by using filters based on the Kubernetes metadata.

Nina2244 commented 5 months ago

What CNS information were you thinking of exactly?

It's very difficult for me to understand the connections because we don't use first class disks.

gnarc commented 5 months ago

For now, just the Kubernetes Cluster/ClusterId would be fine.

Nina2244 commented 5 months ago

Do you know the code capture feature? You would do me a huge favour if you could record what you sent me last, i.e. the pictures, with the code capture feature and send it to me as Python code.

Maybe the RetrieveVStorageObjectAssociations method is the right one to get the info you want, but I can't test it.
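
A rough, untested sketch of how that call might look with pyVmomi; the exact path of the spec data object is an assumption and would need to be checked against the API reference:

# Rough, untested sketch of RetrieveVStorageObjectAssociations via pyVmomi.
# ASSUMPTION: the Python path of the spec type below; check the vSphere API
# reference (vslm.vcenter.VStorageObjectManager.RetrieveVStorageObjSpec).
from pyVmomi import vim

vsom = si.RetrieveContent().vStorageObjectManager  # si: connected ServiceInstance
spec = vim.vslm.vcenter.VStorageObjectManager.RetrieveVStorageObjSpec(
    id=vim.vslm.ID(id='932a0c64-2dac-42f8-8c16-5572dce490b8'),
    datastore=datastore,  # vim.Datastore object looked up beforehand
)
for assoc in vsom.RetrieveVStorageObjectAssociations(ids=[spec]):
    print(assoc)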

gnarc commented 5 months ago

Code capture found. But there are sections in the UI where nothing gets recorded, like the CNS/Container volumes section.

Out of desperation, here is the verbose output of govc for a single fcd, which might give some hints. Maybe CnsQueryVolume from the VSAN mgmt API Ref -> CnsVolumeManager might help.

# govc volume.ls -l -verbose  932a0c64-2dac-42f8-8c16-5572dce490b8 -h
CnsQueryVolume(cns-volume-manager, types.CnsQueryFilter)...
...types.CnsQueryResult{
    Volumes: []types.CnsVolume{
        {
            VolumeId: types.CnsVolumeId{
                Id: "932a0c64-2dac-42f8-8c16-5572dce490b8",
            },
            DatastoreUrl:    "ds:///vmfs/volumes/63243668-091a5050-9898-84160c1591a8/",
            Name:            "pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc",
            VolumeType:      "BLOCK",
            StoragePolicyId: "d19711d4-076d-4219-8904-fd11dd44f47e",
            Metadata:        types.CnsVolumeMetadata{
                ContainerCluster: types.CnsContainerCluster{
                    ClusterType:         "KUBERNETES",
                    ClusterId:           "dev06-42dmd",
                    VSphereUser:         "some-user@openshift.local",
                    ClusterFlavor:       "VANILLA",
                },
                EntityMetadata: []types.BaseCnsEntityMetadata{
                    &types.CnsKubernetesEntityMetadata{
                        CnsEntityMetadata: types.CnsEntityMetadata{
                            EntityName: "humbug",
                            Delete:     false,
                            ClusterID:  "dev06-42dmd",
                        },
                        EntityType:     "PERSISTENT_VOLUME_CLAIM",
                        Namespace:      "zz-jgeo",
                        ReferredEntity: []types.CnsKubernetesEntityReference{
                        },
                    },
                    &types.CnsKubernetesEntityMetadata{
                        CnsEntityMetadata: types.CnsEntityMetadata{
                            EntityName: "pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc",
                            Delete:     false,
                            ClusterID:  "dev06-42dmd",
                        },
                        EntityType:     "PERSISTENT_VOLUME",
                    },
                },
                ContainerClusterArray: []types.CnsContainerCluster{
                    {
                        ClusterType:         "KUBERNETES",
                        ClusterId:           "dev06-42dmd",
                        VSphereUser:         "srv-admin@openshift.local",
                        ClusterFlavor:       "VANILLA",
                    },
                },
            },
            BackingObjectDetails: &types.CnsBlockBackingDetails{
                CnsBackingObjectDetails: types.CnsBackingObjectDetails{
                    CapacityInMb: 1024,
                },
                BackingDiskId:       "932a0c64-2dac-42f8-8c16-5572dce490b8",
            },
            ComplianceStatus:             "compliant",
            DatastoreAccessibilityStatus: "accessible",
            HealthStatus:                 "green",
        },
    },
    Cursor: types.CnsCursor{
        Offset:       1,
        Limit:        100,
        TotalRecords: 1,
    },
}

932a0c64-2dac-42f8-8c16-5572dce490b8    pvc-ad66f834-5e4c-4d88-a05c-3cfbebfb24cc        1.0GB   KUBERNETES      dev06-42dmd
Nina2244 commented 5 months ago

I have a question of understanding. A first class disk is added to / created within a datastore. But I still don't understand how this relates to Cloud Native Storage and container volumes. Are there always container volumes with first class disks, or is that just the case in your environment and there can also be fcds without them?

I think I have found what we are looking for in the VSAN Mgmt API: CnsQueryVolume. This method can be given the volume IDs via a filter, which we already have.
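
A very rough sketch of how that might be called from Python. Everything beyond the SOAP names visible in the govc -verbose output above (CnsQueryVolume on cns-volume-manager with a CnsQueryFilter) is an assumption: the Python bindings come from the separately distributed vSAN Management SDK (vsanapiutils / vsanmgmtObjects), not from pyVmomi, and the helper and type names would need to be verified against that SDK:

# ASSUMPTION-heavy, untested sketch: requires the vSAN Management SDK for Python
# (vsanapiutils / vsanmgmtObjects); the dictionary key, filter type and method
# names below are assumptions derived from the SOAP call shown by govc -verbose.
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim
import vsanapiutils  # shipped with the vSAN Management SDK, not with pyVmomi

context = ssl._create_unverified_context()
si = SmartConnect(host='vcenter.example.com', user='some-user@openshift.local',
                  pwd='some-password', sslContext=context)

# GetVsanVcMos returns the vCenter-side vSAN managed objects; the CNS volume
# manager should be among them ('cns-volume-manager' in the govc output).
vc_mos = vsanapiutils.GetVsanVcMos(si._stub, context=context)
cns_volume_manager = vc_mos['cns-volume-manager']      # assumed key
query_filter = vim.cns.QueryFilter()                   # assumed binding of CnsQueryFilter
result = cns_volume_manager.QueryVolume(query_filter)  # maps to CnsQueryVolume
for volume in result.volumes:
    print(volume.volumeId.id, volume.name,
          volume.metadata.containerCluster.clusterId)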

gnarc commented 5 months ago

I'm not that deep into vSphere myself, but from my understanding and the tests above I'd say an fcd is a generic way of creating a disk w/o a VM, and CNS is just an fcd created by something more cloud-native, e.g. K8s or OpenShift. See if those docs can clear it up for you: