F5Networks / f5-ansible

Imperative Ansible modules for F5 BIG-IP products
GNU General Public License v3.0

bigip_configsync_action with overwrite_config: yes can return "Recommended action: Synchronize to group" #2378

Closed jmcguir closed 6 months ago

jmcguir commented 11 months ago

COMPONENT NAME bigip_configsync_action

Environment ANSIBLE VERSION

ansible-playbook 2.9.11
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/var/lib/awx/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.6/site-packages/ansible
  executable location = /usr/bin/ansible-playbook
  python version = 3.6.8 (default, Apr 16 2020, 01:36:27) [GCC 8.3.1 20191121 (Red Hat 8.3.1-5)]

BIGIP VERSION

bigip:15.1.10.2

OS / ENVIRONMENT

'f5networks.f5_modules:1.26.0'

SUMMARY Checking https://github.com/ansible/ansible_collections_f5/blob/master/plugins/modules/bigip_configsync_action.py

I would expect that bigip_configsync_action with overwrite_config: yes would be equivalent to force-full-load-push so when running the following code:

I would never get the following message:

{
    "msg": "Recommended action: Synchronize this device to group F5-LTM-PAIR-GROUP",
    "invocation": {
        "module_args": {
            "device_group": "F5-LTM-PAIR-GROUP",
            "sync_device_to_group": true,
            "overwrite_config": true,
            "provider": {
                "server": "F5-LTM-PAIR1",
                "user": "USERNAME",
                "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
                "validate_certs": false,
                "server_port": 443,
                "transport": "rest",
                "timeout": null,
                "no_f5_teem": false,
                "auth_provider": null
            },
            "sync_group_to_device": null
        }
    },
    "_ansible_no_log": false,
    "changed": false
}

How can this happen? How could this be made more reliable?

STEPS TO REPRODUCE

1. Configure two units with a basic config (time synced).
2. Configure services on one of the units.
3. Save the config.
4. Perform a sync to the device group as above.

EXPECTED RESULTS

The cluster is always in sync when using this option. The condition "Recommended action: Synchronize xxx to group lb-cluster" is not handled by _wait_for_sync.

I tried to reopen https://github.com/F5Networks/f5-ansible/issues/2065 but I couldn't. This seems to be the same issue. I don't understand why it's closed.

pgouband commented 10 months ago

Hi,

Thanks for reporting. This has been added to the backlog; the internal tracking ID for this request is INFRAANO-433.

urohit011 commented 10 months ago

Hello @jmcguir , I ran the task you shared on bigip 15, and it worked fine in my case. I didn't get the "Recommended action: Synchronize this device to group F5-LTM-PAIR-GROUP" message.

jmcguir commented 10 months ago

Hey @urohit011, did you make a change on the standby F5 and then try to sync from the standby to the device group?

urohit011 commented 10 months ago

Hi @jmcguir , I tried from the standby device and again it worked fine without issues.

pgouband commented 10 months ago

Hi @jmcguir,

We think it will be more efficient to open a case via https://my.f5.com/ so support can check if the sync issue is related to the BIG-IP config and we can ask you to provide a qkview, logs and ucs.

jmcguir commented 9 months ago

@urohit011 and @pgouband, are you following the steps to reproduce, including saving the config?

I can consistently make this happen.

I believe this function is the issue: def _validate_pending_status(self, details) https://github.com/F5Networks/f5-ansible/blob/68124ba2bff5fa20c6f383821b3a61756bea2f0e/ansible_collections/f5networks/f5_modules/plugins/modules/bigip_configsync_action.py#L365C13-L365C13

I mean, it already says it's a hack in the comment on the line below :).

Because when I curl https://f5-alb01.test.com/mgmt/tm/cm/sync-status

I get back this:

{
"kind": "tm:cm:sync-status:sync-statusstats",
"selfLink": "https://localhost/mgmt/tm/cm/sync-status?ver=17.1.1",
"entries": {
    "https://localhost/mgmt/tm/cm/sync-status/0": {
        "nestedStats": {
            "entries": {
                "color": {
                    "description": "red"
                },
                "https://localhost/mgmt/tm/cm/syncStatus/0/details": {
                    "nestedStats": {
                        "entries": {
                            "https://localhost/mgmt/tm/cm/syncStatus/0/details/0": {
                                "nestedStats": {
                                    "entries": {
                                        "details": {
                                            "description": "f5name.com: connected"
                                        }
                                    }
                                }
                            },
                            "https://localhost/mgmt/tm/cm/syncStatus/0/details/1": {
                                "nestedStats": {
                                    "entries": {
                                        "details": {
                                            "description": "F5-DEVICE-GROUP (Changes Pending): There is a possible change conflict between f5-alb01.test.com and f5-alb02.test.com."
                                        }
                                    }
                                }
                            },
                            "https://localhost/mgmt/tm/cm/syncStatus/0/details/2": {
                                "nestedStats": {
                                    "entries": {
                                        "details": {
                                            "description": " - Recommended action: Synchronize f5-alb01.test.com to group F5-DEVICE-GROUP"
                                        }
                                    }
                                }
                            }
                        }
                    }
                },
                "mode": {
                    "description": "high-availability"
                },
                "status": {
                    "description": "Changes Pending"
                },
                "summary": {
                    "description": "There is a possible change conflict between f5-alb01.test.com. and f5-alb02.test.com."
                }
            }
        }
    }
}
}

You can see that the description " - Recommended action: Synchronize f5-alb01.test.com to group F5-DEVICE-GROUP" is present where _validate_pending_status looks for it, which causes the job to error out. For reference, I'm running the sync on the active unit (f5-alb02).
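To make the lookup concrete, here is a minimal Python sketch that walks a sync-status response shaped like the one above and pulls out the "details" descriptions. This is not the module's code: the helper names (`sync_details`, `recommended_actions`, `_detail`) are made up for illustration, and `sample` is a trimmed copy of the response quoted above.

```python
from typing import List


def sync_details(response: dict) -> List[str]:
    """Collect every 'details' description from a sync-status response."""
    found = []
    for top in response.get("entries", {}).values():
        stats = top["nestedStats"]["entries"]
        for key, value in stats.items():
            # Only the ".../details" entry holds the per-line detail messages.
            if not key.endswith("/details"):
                continue
            for item in value["nestedStats"]["entries"].values():
                found.append(item["nestedStats"]["entries"]["details"]["description"])
    return found


def recommended_actions(details: List[str]) -> List[str]:
    """Return the detail lines that carry a 'Recommended action' hint."""
    return [d.lstrip(" -") for d in details if "Recommended action" in d]


def _detail(text: str) -> dict:
    # Hypothetical helper: wraps a string in the nesting the API uses.
    return {"nestedStats": {"entries": {"details": {"description": text}}}}


base = "https://localhost/mgmt/tm/cm/syncStatus/0/details"
sample = {
    "entries": {
        "https://localhost/mgmt/tm/cm/sync-status/0": {
            "nestedStats": {
                "entries": {
                    "color": {"description": "red"},
                    base: {
                        "nestedStats": {
                            "entries": {
                                base + "/0": _detail("f5name.com: connected"),
                                base + "/2": _detail(
                                    " - Recommended action: Synchronize "
                                    "f5-alb01.test.com to group F5-DEVICE-GROUP"
                                ),
                            }
                        }
                    },
                    "status": {"description": "Changes Pending"},
                }
            }
        }
    }
}

print(recommended_actions(sync_details(sample)))
```

Run against the response above, the only line flagged is the "Recommended action" one, which is the message that ends up in the task result.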

The fix would be either to wait a bit longer for the device to sync before polling, or, when overwrite is passed, to ignore this particular message: "Recommended action: Synchronize f5-alb01.test.com to group F5-DEVICE-GROUP", since we are obviously overwriting. Think about it logically: does that recommended action make any sense given what I'm trying to do? I'm telling the F5 to ignore everything and overwrite the config.
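A rough Python sketch of that idea, under stated assumptions: `wait_for_sync` and `fetch_status` are hypothetical names, not functions from the module, and "In Sync" is assumed to be the status text the device reports once the group has converged. The loop keeps polling while a "Recommended action" detail is present if overwrite was requested, and only fails fast when it was not.

```python
import time
from typing import Callable, List, Tuple


def wait_for_sync(
    fetch_status: Callable[[], Tuple[str, List[str]]],
    overwrite: bool,
    attempts: int = 10,
    delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the device reports 'In Sync' or attempts run out.

    fetch_status is a caller-supplied function (hypothetical) that returns
    the current status text and the list of detail descriptions.
    """
    for _ in range(attempts):
        status, details = fetch_status()
        if status == "In Sync":  # assumed status text for a converged group
            return True
        pending = any("Recommended action" in d for d in details)
        if pending and not overwrite:
            # Without overwrite, the hint is a real conflict worth surfacing.
            raise RuntimeError("; ".join(d.strip() for d in details))
        # With overwrite, treat the hint as transient and poll again.
        sleep(delay)
    return False
```

The `sleep` parameter is only there so the delay can be swapped out when exercising the loop without a device; whether this retry behaviour is appropriate for the module is for the maintainers to judge.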

I don't see how a config item would be affecting this. Please look at https://github.com/F5Networks/f5-ansible/issues/2065 for further evidence of this.

urohit011 commented 9 months ago

Hi @jmcguir, I followed all the steps you mentioned but I didn't see the error. About saving the config: could you tell me how you save it? I assume it happens automatically in the GUI. Are you doing it differently, or am I missing something?

jmcguir commented 8 months ago

Hey @urohit011, I'm saving the config with Ansible.

The key is that after the save on one side, there is a config difference that makes the pair out of sync.

urohit011 commented 8 months ago

Could you please provide the entire playbook you're running, @jmcguir? Thanks

jmcguir commented 8 months ago

I can't share the whole playbook publicly as it has employer-specific details. It's thousands of lines long and is made up of Python and Ansible. I've opened case 00550506, @urohit011.

pgouband commented 7 months ago

Hi @jmcguir,

We are able to access the file you uploaded, but we think it's better to open a case and ask for it to be escalated to the Ansible dev team so we can discuss via support, as we may need some information. I apologize if there was a misunderstanding about the case. Could you open a new case and ask to escalate it to the Ansible dev team?

jmcguir commented 7 months ago

Hi @pgouband, I've reopened the case and escalated it to the Ansible dev team.

urohit011 commented 7 months ago

Hi @jmcguir, I tried running the playbook without the vault-protected var files and the roles, and the configsync_action task ran fine without any issue.

jmcguir commented 7 months ago

Okay, now uncomment lines 363-368 in upgrade_bigip.yml and report back to me, please.

Did you look at this https://github.com/F5Networks/f5-ansible/issues/2378#issuecomment-1863439650?

I linked some source code that should inform the issue.

urohit011 commented 7 months ago

@jmcguir Working on it

pgouband commented 6 months ago

Hi,

We tried to reproduce the issue in our lab. The support team also tried, without success, and there was no response from you. The issue may be in your environment. Please reopen a support case if the issue is still occurring.