ansible-collections / netapp.ontap

Ansible collection to support NetApp ONTAP configuration.
https://galaxy.ansible.com/netapp/ontap
GNU General Public License v3.0

breaking snapmirror throws error "entry doesn't exist" #77

Closed christian-naenny closed 2 years ago

christian-naenny commented 2 years ago

Summary

Sample code for the snapmirror operation:

- name: "change the snapmirror state to {{ state }}"
  netapp.ontap.na_ontap_snapmirror:
    state: "present"
    relationship_state: "{{ state }}"
    source_endpoint: 
      path:    "{{ svm + ':' }}"
    destination_endpoint:
      path:    "{{ dr_svm + ':' }}"
    hostname: "{{ cluster_hostname }}"
    username: "{{ admin_username }}"
    password: "{{ admin_password }}"
    validate_certs: no
    use_rest: always

When I set the relationship_state to "broken", the playbook throws the error: "msg": "calling: snapmirror/relationships/5c9b50b5-741b-11ea-a1dc-00a098d90e1d: got {'message': \"entry doesn't exist\", 'code': '4', 'target': 'uuid'}."

But after the execution, despite the error message, the DR SVM reports a mirror state of "snapmirrored" and a relationship status of "quiesced"!

When I set the relationship_state to "active" for the next run of the playbook, it works just fine and the DR SVM reports a mirror state of "snapmirrored" and a relationship status of "idle"!

Component Name

pip

Ansible Version

ansible [core 2.12.4]
  config file = /home/chn/.ansible/ansible.cfg
  configured module search path = ['/home/chn/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /opt/takeover/venvs/to_devel_py38_ans212/lib64/python3.8/site-packages/ansible
  ansible collection location = /home/chn/.ansible/collections:/usr/share/ansible/collections
  executable location = /opt/takeover/venvs/to_devel_py38_ans212/bin/ansible
  python version = 3.8.12 (default, Sep 16 2021, 10:46:05) [GCC 8.5.0 20210514 (Red Hat 8.5.0-3)]
  jinja version = 3.0.3
  libyaml = True

ONTAP Collection Version

netapp.aws                    21.7.0
netapp.azure                  21.10.0
netapp.cloudmanager           21.15.0
netapp.elementsw              21.7.0
netapp.ontap                  21.17.3
netapp.storagegrid            21.10.0
netapp.um_info                21.8.0
netapp_eseries.santricity     1.3.0

ONTAP Version

NetApp Release 9.8P9: Wed Dec 22 09:39:04 UTC 2021

Playbook

- name: "change the snapmirror state to {{ state }}"
  netapp.ontap.na_ontap_snapmirror:
    state: "present"
    relationship_state: "{{ state }}"
    source_endpoint: 
      path:    "{{ svm + ':' }}"
    destination_endpoint:
      path:    "{{ dr_svm + ':' }}"
    hostname: "{{ cluster_hostname }}"
    username: "{{ admin_username }}"
    password: "{{ admin_password }}"
    validate_certs: no
    use_rest: always

Steps to Reproduce

  1. power off the source SVM on the primary cluster
  2. ansible-playbook ./snapmirror.yml -e '{ "svm": "svmdczh97", "state": "broken" }'

Expected Results

I expect the snapmirror state of the DR SVM to be changed to "broken off" and the relationship status to "idle".

Actual Results

TASK [change the snapmirror state to broken] *************************************************************************************************************************************************************************************************************************************************
task path: /home/chn/git/datacenter-takeover-ansible/svm/snapmirror.yml:50
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: chn
<127.0.0.1> EXEC /bin/sh -c 'echo ~chn && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/chn/.ansible/tmp `"&& mkdir "` echo /home/chn/.ansible/tmp/ansible-tmp-1655131372.0626814-1519985-17167292290071 `" && echo ansible-tmp-1655131372.0626814-1519985-17167292290071="` echo /home/chn/.ansible/tmp/ansible-tmp-1655131372.0626814-1519985-17167292290071 `" ) && sleep 0'
Using module file /home/chn/.ansible/collections/ansible_collections/netapp/ontap/plugins/modules/na_ontap_snapmirror.py
<127.0.0.1> PUT /home/chn/.ansible/tmp/ansible-local-15199609vadxkhq/tmpm5mf7tht TO /home/chn/.ansible/tmp/ansible-tmp-1655131372.0626814-1519985-17167292290071/AnsiballZ_na_ontap_snapmirror.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /home/chn/.ansible/tmp/ansible-tmp-1655131372.0626814-1519985-17167292290071/ /home/chn/.ansible/tmp/ansible-tmp-1655131372.0626814-1519985-17167292290071/AnsiballZ_na_ontap_snapmirror.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/opt/takeover/venvs/to_devel_py38_ans212/bin/python3.8 /home/chn/.ansible/tmp/ansible-tmp-1655131372.0626814-1519985-17167292290071/AnsiballZ_na_ontap_snapmirror.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /home/chn/.ansible/tmp/ansible-tmp-1655131372.0626814-1519985-17167292290071/ > /dev/null 2>&1 && sleep 0'
fatal: [localhost]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "cert_filepath": null,
            "connection_type": "ontap_ontap",
            "create_destination": null,
            "destination_cluster": null,
            "destination_endpoint": {
                "cluster": null,
                "consistency_group_volumes": null,
                "ipspace": null,
                "path": "dr-svmdczh97:",
                "svm": null
            },
            "destination_path": null,
            "destination_volume": null,
            "destination_vserver": null,
            "feature_flags": {},
            "hostname": "nasdcbe01",
            "http_port": null,
            "https": false,
            "identity_preserve": null,
            "initialize": true,
            "key_filepath": null,
            "max_transfer_rate": null,
            "ontapi": null,
            "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
            "policy": null,
            "relationship_info_only": false,
            "relationship_state": "broken",
            "relationship_type": null,
            "schedule": null,
            "source_cluster": null,
            "source_endpoint": {
                "cluster": null,
                "consistency_group_volumes": null,
                "ipspace": null,
                "path": "svmdczh97:",
                "svm": null
            },
            "source_hostname": null,
            "source_password": null,
            "source_path": null,
            "source_snapshot": null,
            "source_username": null,
            "source_volume": null,
            "source_vserver": null,
            "state": "present",
            "update": true,
            "use_rest": "always",
            "username": "ansitako",
            "validate_certs": false
        }
    },
    "msg": "calling: snapmirror/relationships/5c9b50b5-741b-11ea-a1dc-00a098d90e1d: got {'message': \"entry doesn't exist\", 'code': '4', 'target': 'uuid'}."
}

PLAY RECAP ***********************************************************************************************************************************************************************************************************************************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

lonico commented 2 years ago

The error {'message': "entry doesn't exist", 'code': '4', 'target': 'uuid'} is coming from ONTAP. Before attempting to break the relationship, the module puts it in a quiesced state. So it seems the REST API call to quiesce the relationship succeeded, but the call to break it failed.

We made a change in 21.20.0 to wait for the relationship to reach the quiesced state before sending the break request. Before this, when using REST, we would send the break request immediately. Could you give it a try?
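Until the upgraded collection is in place, one possible workaround is to retry the break task until the quiesce has settled on the ONTAP side. This is only a sketch, not documented module behavior; the variable names match the playbook quoted earlier in this issue:

```yaml
# Hypothetical workaround sketch: retry the break while ONTAP finishes
# quiescing the relationship. Retry/delay values are illustrative.
- name: "break the snapmirror relationship, retrying while the quiesce settles"
  netapp.ontap.na_ontap_snapmirror:
    state: present
    relationship_state: broken
    source_endpoint:
      path: "{{ svm + ':' }}"
    destination_endpoint:
      path: "{{ dr_svm + ':' }}"
    hostname: "{{ cluster_hostname }}"
    username: "{{ admin_username }}"
    password: "{{ admin_password }}"
    validate_certs: no
    use_rest: always
  register: break_result
  until: break_result is not failed
  retries: 5
  delay: 30
```

With `until`/`retries`, Ansible re-runs the task on failure, so a transient "entry doesn't exist" during the quiesce window would be retried instead of failing the play immediately.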

christian-naenny commented 2 years ago

I'm sorry, I don't quite understand yet. What should I give a try?

Here's where I'm coming from: I had already solved the problem of breaking the snapmirror relationship with a script that uses the ONTAP CLI. I already have code that quiesces the scheduler and then breaks the snapmirror relationship once the status is "quiesced, idle". But now I would like to replace my custom code with the Ansible module. Is there a different way of doing this? Is there a parameter I missed? I was expecting the module to quiesce the scheduler and then, once it has been quiesced, to break the snapmirror relationship. What "entry" is ONTAP missing here?
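For reference, the CLI-based sequence described above corresponds roughly to the following ONTAP commands (a sketch; the prompt is illustrative and the destination path is taken from the module output in this issue):

```
dr-cluster::> snapmirror quiesce -destination-path dr-svmdczh97:
dr-cluster::> snapmirror show -destination-path dr-svmdczh97: -fields state,status
dr-cluster::> snapmirror break -destination-path dr-svmdczh97:
```

The show step is repeated until the relationship reports "quiesced, idle" before the break is issued.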

lonico commented 2 years ago

Sorry, I meant upgrading your ONTAP collection from 21.17.3 to 21.20.0.

christian-naenny commented 2 years ago

Ah yes, of course! I will update the ONTAP collection and try again!

christian-naenny commented 2 years ago

Sorry, even after updating the netapp.ontap collection to version 21.20.0 I get the same "entry doesn't exist" error when I try to break the snapmirror connection.

$ ansible-galaxy collection list netapp.ontap
# /home/chn/.ansible/collections/ansible_collections
Collection   Version
------------ -------
netapp.ontap 21.20.0

While running the playbook: Using module file /home/chn/.ansible/collections/ansible_collections/netapp/ontap/plugins/modules/na_ontap_snapmirror.py

lonico commented 2 years ago

So far, we're not able to reproduce the issue in our labs. Still trying.

Two questions:

  1. power off the source SVM on the primary cluster - what do you mean by power off here?
  2. With 21.20.0, do you see the same error message as shown above, or is it slightly different?

christian-naenny commented 2 years ago

Sorry for the delay. I had a few days of vacation...

Here are the answers to your questions:

  1. power off basically means stop the SVM (vserver stop -vserver $svmName)
  2. It throws a slightly different error: "msg": "Error patching SnapMirror: {'state': 'broken_off'} - calling: snapmirror/relationships/5c9b50b5-741b-11ea-a1dc-00a098d90e1d: got {'message': \"entry doesn't exist\", 'code': '4', 'target': 'uuid'}."

christian-naenny commented 2 years ago

Just to be sure, these are the prerequisite packages in the venv I'm using to run the playbook:

ansible      5.6.0
ansible-core 2.12.4
netapp-lib   2021.6.25
requests     2.27.1

lonico commented 2 years ago

Yes, I assumed you were doing a vserver stop. I also assumed you are using two clusters.

The new error message also confirms you are using the latest collection, and the location where the issue is reported from.

This is puzzling: I could not reproduce the issue on 9.11 or 9.8. Since we're using the same UUID in the quiesce and break requests, it's surprising that the first one succeeds and the second one fails.
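When use_rest is in effect, the quiesce/break sequence boils down to two PATCH calls against the same relationship UUID. An illustrative curl sketch (UUID and hostname taken from the failure output above; in REST, quiescing is expressed as the "paused" state):

```
# Quiesce the relationship (REST state "paused"):
curl -k -u admin -X PATCH \
  "https://nasdcbe01/api/snapmirror/relationships/5c9b50b5-741b-11ea-a1dc-00a098d90e1d" \
  -d '{"state": "paused"}'

# Break the relationship:
curl -k -u admin -X PATCH \
  "https://nasdcbe01/api/snapmirror/relationships/5c9b50b5-741b-11ea-a1dc-00a098d90e1d" \
  -d '{"state": "broken_off"}'
```

Since both calls target the same /api/snapmirror/relationships/{uuid} resource, a "entry doesn't exist" on the second call only is unexpected.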

I'll try with 9.8P9.

carchi8py commented 2 years ago

@christian-naenny let us know if you're still having the issue with 9.8P9; if so, please reopen the issue.