Juniper / ansible-junos-stdlib

Junos modules for Ansible
Apache License 2.0
303 stars 157 forks source link

Timeout value for rpc timeout not used #607

Closed T0nip closed 4 months ago

T0nip commented 1 year ago

Issue Type

Module Name

juniper.device.config

juniper.device collection and Python libraries version

ansible --version
ansible [core 2.14.1]
  config file = None
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.9/dist-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.9.2 (default, Feb 28 2021, 17:03:44) [GCC 10.2.1 20210110] (/usr/bin/python3)
  jinja version = 3.1.2
  libyaml = True
pip freeze
ansible==7.1.0
ansible-core==2.14.1
ansible-pylibssh==1.1.0
bcrypt==4.0.1
cffi==1.15.1
cryptography==39.0.0
Jinja2==3.1.2
junos-eznc==2.6.6
jxmlease==1.0.3
lxml==4.9.2
MarkupSafe==2.1.2
ncclient==0.6.13
netaddr==0.8.0
packaging==23.0
paramiko==3.0.0
pycparser==2.21
PyNaCl==1.5.0
pyparsing==3.0.9
pyserial==3.5
PyYAML==6.0
resolvelib==0.8.1
scp==0.14.4
six==1.16.0
supervisor==4.2.2
transitions==0.9.0
xmltodict==0.13.0
yamlordereddictloader==0.4.0

OS / Environment

Juniper srx300 21.2R3-S2.9 Summary

The provided value for timeout isn't used in modules, so the playbook fails while commiting changes.

Steps to reproduce

Commit changes that take longer than 30s.

        - name: install new config
          config:
            host: "{{ inventory_hostname }}"
            port: "{{ sshport }}"
            user: "{{ username }}"
            passwd: "{{ password }}"
            load: 'set'
            src: "{{ config_path }}{{ config_file }}"
            logfile: "junos.log"
            confirmed: 20
            timeout: 120

Expected results

timeout value should be used, so setting 120s should adjust the timeout used, to 120s.

Actual results

playbook stops with timeout error after 30s

IvanBCode commented 1 year ago

the same issue is there any workaround?

m0msg commented 1 year ago

i have the same issue timeout is 180 but the default value is used

fatal: [xxx]: FAILED! => {"changed": false, "msg": "Failure committing the configuraton: RpcTimeoutError(host: xxx, cmd: commit-configuration, timeout: 30)"}

fatal: [xxx]: FAILED! => { "changed": false, "invocation": { "module_args": { "attempts": null, "baud": null, "check": null, "check_commit_wait": null, "comment": "Full Config Update via \"junos_fw_commit_auto.yml\"", "commit": null, "commit_empty_changes": false, "commit_force_sync": false, "commit_full": false, "commit_sync": false, "config_mode": "private", "confirm": 10, "confirmed": 10, "console": null, "cs_passwd": null, "cs_user": null, "dest": null, "dest_dir": null, "diff": null, "diffs_file": null, "filter": null, "format": "set", "host": "xxx", "huge_tree": false, "ignore_warning": null, "level": null, "lines": null, "load": "set", "logdir": null, "logfile": null, "mode": null, "model": null, "namespace": null, "options": {}, "passwd": null, "port": 830, "remove_ns": null, "retrieve": null, "return_output": true, "rollback": null, "src": "../../output/xxx", "ssh_config": null, "ssh_private_key_file": "xxx", "template": null, "timeout": 180, "url": null, "user": "root", "vars": null } }, "msg": "Failure committing the configuraton: RpcTimeoutError(host: xxx, cmd: commit-configuration, timeout: 30)" }

Darkuja commented 1 year ago

Same here. Fortunately, I have a dev server where it's taken into account. The only difference is juniper.device runs in 1.0.1.

Force install 1.0.1 solve the issue : ansible-galaxy collection install juniper.device:==1.0.1

cbutera-sqsp commented 1 year ago

Ran into same issue. Putting timeout value in quotes fixed it for me.

juniper.device.config:
  port: "830"
  timeout: "45"
  load: merge

I noticed similar behavior with the port setting. Although I had defined 830 it was still opening a connection over port 22. After putting the port value in quotes it started using port 830 as I had defined.

Its confusing because the documentation shows timeout value as an int and port value as int or str

jnicholson56 commented 12 months ago

I too have been running into this and it seems the only function of the task that does not use the provided timeout is the commit function. Adding the timeout value in quotes as previously described did not help me.

Every other step in the config task from Open a candidate configuration database to Retrieve the configuration database from the Junos device uses the provided timeout value from the Ansible task. The commit step does not. From this output snippet, using -vvvv during the running of the task, we can see the timeout is set to the one I defined in Ansible (300), but the default is still 30 seconds.

....

            "timeout": 300,    <---------------- Timeout defined by me in task code
            "url": null,
            "user": "sanitized",
            "vars": null
        }
    },
    "msg": "Failure committing the configuraton: RpcTimeoutError(host: router_name, cmd: commit-configuration, timeout: 30)"  <-------------- default in use
}

Potential Bug:

I looked through the module code and I think I identified a potential bug.

In file https://github.com/Juniper/ansible-junos-stdlib/blob/master/ansible_collections/juniper/device/plugins/module_utils/juniper_junos_common.py

The function name is commit_configuration.

The start of the function sets the default timeout to 30:

   def commit_configuration(self, ignore_warning=None, comment=None,
                             confirmed=None, timeout=30, full=False,
                             sync=False, force_sync=False):

The commit is called later in the function and shows timeout=timeout:

        try:
            self.config.commit(ignore_warning=ignore_warning,
                               comment=comment,
                               confirm=confirmed,
                               timeout=timeout,
                               full=full,
                               force_sync=force_sync,
                               sync=sync)
            self.logger.debug("Configuration committed.")
        except (self.pyez_exception.RpcError,
                self.pyez_exception.ConnectError) as ex:
            self.fail_json(msg='Failure committing the configuraton: %s' %
                               (str(ex)))

Comparing this to previous functions, there never seems to be an instance where the commit_configuration calls the user defined timeout. Previous functions have the following in use and I believe a call to a user defined timeout if it exists is needed to resolve this issue.

Example from the open function:

self.dev.timeout = timeout

*** Edited this section to provide the correct code I meant to copy

The commit_configuration just always calls it's own default.

Workaround:

ncclient times out at 30 seconds, but the router continues to commit fine so the configuration changes do go through. I worked around this by putting each config change in it's own block and adding a rescue that was a copy of the original task.

Ansible's second attempt sees the config is there and moves right along with the next task fine. I have upgraded ~10-15 routers this way now and it appears to be a valid workaround.

Example:

- name: Disabling all procotols block
  block:
    - name: Disable protocols
      juniper.device.config:
        load: "set"
        lines:
          - "deactivate protocols ldp"
          - "deactivate protocols l2circuit"
          - "deactivate protocols mpls"
          - "deactivate protocols rsvp"
        ignore_warning: true
        commit: true
        timeout: 300
        level: DEBUG
        logfile: "{{ logfile }}"

  rescue:
    - name: Disable protocols rescue
      juniper.device.config:
        load: "set"
        lines:
          - "deactivate protocols ldp"
          - "deactivate protocols l2circuit"
          - "deactivate protocols mpls"
          - "deactivate protocols rsvp"
        ignore_warning: true
        commit: true
        timeout: 300
        level: DEBUG
        logfile: "{{ logfile }}"
jnicholson56 commented 12 months ago

I was able to do a quick and dirty update to the juniper_junos_common.py and it seems to resolve the issue.

I added the following code to the _commitconfiguration function:

        if self.dev.timeout:
            timeout = self.dev.timeout

Full function code:

    def commit_configuration(
        self,
        ignore_warning=None,
        comment=None,
        confirmed=None,
        timeout=30,
        full=False,
        sync=False,
        force_sync=False,
    ):
        """Commit the candidate configuration.

        Commit the configuration. Assumes the configuration is already opened.

        Args:
            ignore_warning - Which warnings to ignore.
            comment - The commit comment
            confirmed - Number of minutes for commit confirmed.
            timeout - Timeout for commit configuration. Default timeout value is 30s.
            full - apply full commit
            sync - Check for commit syntax and sync between RE's
            force_sync - Ignore syntax check and force to sync between RE's

        Failures:
            - An error returned from committing the configuration.
        """

        if self.dev.timeout:
            timeout = self.dev.timeout

        if self.conn_type != "local":
            self._pyez_conn.commit_configuration(
                ignore_warning, comment, timeout, confirmed
            )
            return

        if self.dev is None or self.config is None:
            self.fail_json(msg="The device or configuration is not open.")

        self.logger.debug("Committing the configuration.")
        try:
            self.config.commit(
                ignore_warning=ignore_warning,
                comment=comment,
                confirm=confirmed,
                timeout=timeout,
                full=full,
                force_sync=force_sync,
                sync=sync,
            )
            self.logger.debug("Configuration committed.")
        except (self.pyez_exception.RpcError, self.pyez_exception.ConnectError) as ex:
            self.fail_json(msg="Failure committing the configuraton: %s" % (str(ex)))
mpc755 commented 10 months ago

Confirmed fix: Before adding fix: TASK [configuration settings] ******************************************************************************************************************* fatal: [ex2300]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/bin/python3"}, "changed": false, "msg": "Failure committing the configuraton: RpcTimeoutError(host: <snip>, cmd: commit-configuration, timeout: 30)"}

After adding fix (with "timeout: '60'"): TASK [configuration settings] ****************************************************************************************************************************************************** changed: [ex2300]

vishnubraj commented 9 months ago

@chidanandpujar Can we fix this in the main code? The fix mentioned by @jnicholson56 is working fine for us.

vishnubraj commented 8 months ago

@chidanandpujar @jnicholson56 Can we create a PR for this?

jnicholson56 commented 8 months ago

@vishnubraj I thought you were with juniper :D I am not against that but it seems like maybe no one is working this case actively. I am also unsure of the release cadence for this ansible package.

chidanandpujar commented 4 months ago

Hi @T0nip Fix is merged now https://github.com/Juniper/ansible-junos-stdlib/pull/665 hence closing this issue, Please verify and update the results .

Thanks Chidanand