Juniper / ansible-junos-stdlib

Junos modules for Ansible
Apache License 2.0

Juniper_junos_software ISSU error for SRX #331

Open stuartianbrown opened 6 years ago

stuartianbrown commented 6 years ago

When attempting an ISSU with the juniper_junos_software module on an SRX550, the module raises an error about the number of routing engines:

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: TypeError: ISSU/NSSU requires Multi RE setup
fatal: [LAB-SRX550-node0]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_lYr78N/ansible_module_juniper_junos_software.py\", line 788, in <module>\n    main()\n  File \"/tmp/ansible_lYr78N/ansible_module_juniper_junos_software.py\", line 701, in main\n    ok = junos_module.sw.install(**install_params)\n  File \"/usr/lib/python2.7/site-packages/jnpr/junos/utils/sw.py\", line 804, in install\n    'ISSU/NSSU requires Multi RE setup')\nTypeError: ISSU/NSSU requires Multi RE setup\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 0}

However, running the command on the device itself works:

request system software in-service-upgrade junos-srxsme-12.3X48-D50.6-domestic.tgz no-sync

The playbook I'm using is:

---
- name: Install Junos OS
  hosts: LAB-SRX550-node0
  gather_facts: no
  roles:
    - Juniper.junos
  connection: local

  tasks:
    - name: Install Junos OS Package
      juniper_junos_software:
        provider: "{{ cli }}"
        version: 12.1X44-D40.2
        local_package: /etc/ansible/images/junos-srxsme-12.1X44-D40.2-domestic.tgz
        issu: yes
      register: response
    - name: Print the response
      debug:
        var: response

prime001 commented 4 years ago

Did anyone ever resolve this?

rahkumar651991 commented 4 years ago

@prime001 - We now support single-RE ISSU. If you are still facing the issue, please share the NETCONF trace for it.

https://www.juniper.net/documentation/en_US/junos/topics/example/netconf-traceoptions-configuring.html

mersl commented 3 years ago

I hit the same issue today when I tried an ISSU upgrade of an SRX345 cluster. I believe I have fairly recent modules installed (see the Python and Ansible module versions below).

Current running Junos: 18.4R3
Target Junos: 19.4R3-S3.3

I set the traceoptions as advised in the earlier post, and the run stopped at the virtual chassis (VC) information collection:

May 18 14:41:32 [NETCONF] - [62457] Incoming: <?xml version="1.0" encoding="UTF-8"?></nc:rpc>]]>]]>
May 18 14:41:32 [NETCONF] - [62457] Outgoing:
May 18 14:41:32 [NETCONF] - [62457] Outgoing:
May 18 14:41:32 [NETCONF] - [62457] Outgoing: protocol
May 18 14:41:32 [NETCONF] - [62457] Outgoing: operation-failed
May 18 14:41:32 [NETCONF] - [62457] Outgoing: error
May 18 14:41:32 [NETCONF] - [62457] Outgoing: syntax error
May 18 14:41:32 [NETCONF] - [62457] Outgoing:
May 18 14:41:32 [NETCONF] - [62457] Outgoing: get-virtual-chassis-information
May 18 14:41:32 [NETCONF] - [62457] Outgoing:
May 18 14:41:32 [NETCONF] - [62457] Outgoing:
May 18 14:41:32 [NETCONF] - [62457] Debug: The last token parsed by mgd [62457] was [get-virtual-chassis-information] and gram data current token [get-virtual-chassis-information]
May 18 14:41:32 [NETCONF] - [62457] Outgoing:
May 18 14:41:32 [NETCONF] - [62457] Outgoing: ]]>]]>

BTW, the previous RPC call, which checks the REs, was:

May 18 14:41:30 [NETCONF] - [62457] Incoming: <?xml version="1.0" encoding="UTF-8"?></nc:rpc>]]>]]>
May 18 14:41:30 [NETCONF] - [62457] Outgoing:
May 18 14:41:30 [NETCONF] - [62457] Outgoing:
May 18 14:41:30 [NETCONF] - [62457] Outgoing:
May 18 14:41:31 [NETCONF] - [62457] Outgoing: ...
May 18 14:41:31 [NETCONF] - [62457] Outgoing:
May 18 14:41:31 [NETCONF] - [62457] Outgoing:
May 18 14:41:31 [NETCONF] - [62457] Outgoing:
May 18 14:41:31 [NETCONF] - [62457] Outgoing:
May 18 14:41:31 [NETCONF] - [62457] Outgoing: ]]>]]>

That response also shows the expected information (serial numbers, FPCs, etc.) for both SRX cluster nodes.


$ ansible-galaxy collection list

~/.ansible/collections/ansible_collections
Collection             Version
ansible.netcommon      2.0.2
ansible.utils          2.1.0
community.general      3.0.2
juniper.device         1.0.0
junipernetworks.junos  2.1.0

$ python3 -m pip list

Package                Version
ansible                3.3.0
ansible-base           2.10.9
bcrypt                 3.2.0
cffi                   1.14.5
cryptography           3.4.7
importlib-resources    5.1.2
Jinja2                 2.11.3
jmespath               0.10.0
junos-eznc             2.6.0+3.g2358a3b
jxmlease               1.0.3
lxml                   4.6.3
MarkupSafe             1.1.1
ncclient               0.6.9
netaddr                0.8.0
packaging              20.9
paramiko               2.7.2
pip                    21.1.1
pycparser              2.20
PyNaCl                 1.4.0
pyparsing              2.4.7
pyserial               3.5
PyYAML                 5.4.1
scp                    0.13.3
setuptools             39.2.0
six                    1.16.0
transitions            0.8.8
wheel                  0.36.2
xmltodict              0.12.0
yamlordereddictloader  0.4.0
zipp                   3.4.1

mersl commented 3 years ago

I dug deeper, and it seems the issue is close to here

When I check the facts gathered from the SRX cluster, there is no 'localre' element. The array is: "current_re": [ "node0", "master", "fpc0", "node", "fwdd", "member", "pfem", "re0", "fpc0.pic0" ]

Out of curiosity, I changed localre to master, and the installation starts (kwargs: {"no-sync": true} is set, as it is an ICU-style ISSU on my 345s).

Is there an issue in the facts that causes 'localre' to be missing? By the way, file show /etc/hosts.junos | match localre gives the expected result.
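
To illustrate the suspected failure mode, here is a minimal sketch in plain Python. It assumes the validation looks for the literal alias 'localre' in the current_re fact. The function name is_local_re and its expected parameter are hypothetical and are not PyEZ's actual API; only the fact list itself comes from the cluster output above.

```python
# current_re fact as reported above for the SRX345 cluster node.
current_re = ["node0", "master", "fpc0", "node", "fwdd",
              "member", "pfem", "re0", "fpc0.pic0"]


def is_local_re(current_re, expected="localre"):
    """Hypothetical check: is the expected RE alias present in the fact list?"""
    return expected in current_re


# 'localre' is absent from the list, so a check for it fails.
print(is_local_re(current_re))                     # False
# 'master' is present, which matches the workaround described above.
print(is_local_re(current_re, expected="master"))  # True
```

If the real check works along these lines, any cluster whose current_re fact lacks 'localre' would fail the same way, regardless of how many REs are actually present.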

chidanandpujar commented 1 year ago

Hi @mersl @prime001 @stuartianbrown, thanks. Could you please verify the issue with the latest PyEZ and share the NETCONF trace logs? https://www.juniper.net/documentation/en_US/junos/topics/example/netconf-traceoptions-configuring.html

Thanks

astritm commented 9 months ago

I am encountering exactly the same issue when using the juniper.device.software module to upgrade a cluster of SRX300 boxes with ISSU and "no-sync": true set.

Have there been any solutions or workarounds identified?

In the meantime, I managed to perform the upgrade using the following task. It needs some adjustment, however, because the RPC never receives a response and eventually times out. To work around this, consider using ignore_errors: true in conjunction with a default timeout.

- name: "Upgrade JunOS {{ osversion }} on {{ inventory_hostname }}"
  juniper.device.rpc:
    rpcs:
      - request-package-in-service-upgrade
    kwargs:
      no-sync: true
      package-name: "/var/tmp/{{ ospackage }}"
    logfile: "{{ logdir }}/{{ inventory_hostname }}-upgrade.txt"
    port: "{{ ncf_port }}"
    timeout: "{{ wait }}"
  ignore_errors: true
  register: upgrade_response

My next objective is to retrieve the upgrade status for both nodes and present it to the user. I am thinking of getting the status from the NETCONF traceoptions (via Ansible) if nothing else works.
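
One possible starting point for reading status out of a NETCONF trace file is to parse it line by line. The sketch below assumes the trace line layout shown earlier in this thread (timestamp, [NETCONF], process ID, then Incoming/Outgoing/Debug). The names parse_trace and failed_messages and the keyword list are hypothetical, and whether the trace actually contains usable upgrade status is untested.

```python
import re

# Matches lines such as:
#   May 18 14:41:32 [NETCONF] - [62457] Outgoing: operation-failed
TRACE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \[NETCONF\] - \[(?P<pid>\d+)\] "
    r"(?P<kind>Incoming|Outgoing|Debug):\s?(?P<text>.*)$"
)


def parse_trace(lines):
    """Turn raw trace lines into dicts; lines that do not match are skipped."""
    entries = []
    for line in lines:
        match = TRACE_RE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def failed_messages(entries, keywords=("operation-failed", "error")):
    """Return outgoing text containing any of the (assumed) error keywords."""
    return [
        entry["text"]
        for entry in entries
        if entry["kind"] == "Outgoing"
        and any(keyword in entry["text"] for keyword in keywords)
    ]
```

In an Ansible workflow, the trace file would first have to be fetched from the device; the parsed entries could then be filtered and shown with a debug task.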