aruba / aoscx-ansible-collection

Ansible collections for AOS-CX switches 

Collection modules do not logoff from the session after an action is performed #95

Open muralihcl opened 5 months ago

muralihcl commented 5 months ago

Hi Team, while we were demonstrating the different use cases written with the aoscx Ansible collections, we observed that the switch regularly hits its session limit because previous sessions are not terminated automatically. This forces us to wait until the sessions time out on their own (the switch default allows a maximum of 6 connections with a 20-minute timeout). Although we have reduced the timeout to 5 minutes so that sessions are released faster, we were looking for a built-in method to tackle this. So far, there is no documented way to use the same session for all the tasks with the collections.

While I was working with the API directly, we would save the session token, use the same session for all the actions, and then terminate the session at the end. Is there any method to achieve this with the collections?
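
For context, the raw workflow I am referring to looks roughly like the sketch below. This is only an illustration, assuming the usual AOS-CX REST login/logout endpoints: the switch address, credentials, API version and endpoint paths are placeholders, not values from our environment.

import requests
import urllib3

urllib3.disable_warnings()  # we run with certificate validation disabled

switch = "10.0.0.1"                             # placeholder switch address
base = "https://{}/rest/v10.09".format(switch)  # placeholder API version

s = requests.Session()
s.verify = False

# Log in once; the session cookie is kept on the requests.Session object
s.post("{}/login".format(base), data={"username": "admin", "password": "secret"})

# Reuse the same session (and cookie) for every subsequent call
vlans = s.get("{}/system/vlans".format(base)).json()
print(vlans)

# Log out explicitly at the end so the session slot is freed immediately
s.post("{}/logout".format(base))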

Please let me know if there are any hints.

With regards,

Muralidhara Kakkunje

alagoutte commented 5 months ago

Hi @muralihcl,

What are your playbook and Ansible Aruba collection versions?

I don't see the same issue on my side...

tchiapuziowong commented 5 months ago

@muralihcl can you please share your pyaoscx & aoscx Ansible collection versions? Are you receiving timeout errors after multiple executions? Our provider handles the logout handshake after the REST API call has completed for a specific module/feature, on a per-task basis, so you should see a login & logout for each task in the playbook.

muralihcl commented 5 months ago

Hi @tchiapuziowong ,

These are the versions in our execution environment: pyaoscx 2.5.1, arubanetworks.aoscx 4.3.1.

Yes, after multiple executions we were getting a "Session limit exceeded" error (at that time, the default timeout was 20 minutes and the maximum number of sessions per user was 6). We never observed this while developing the playbooks. During a demonstration, since the playbooks were ready and we used the same credentials throughout, we reached the session limit and the runs started failing. We then changed the settings to a 5-minute timeout and a maximum of 8 sessions per user. Although this worked around the issue for the moment, I am afraid that a workflow with multiple job templates stitched together might still hit the error by the time it reaches the final template. Considering that, a way to terminate the session at the end of each task would be of great use.

With regards,

Muralidhara Kakkunje

stephanelechner commented 4 months ago

Hi @tchiapuziowong, any update on this?

alagoutte commented 4 months ago

Hi @tchiapuziowong, any update on this?

What switch model / firmware release are you running? An example playbook that shows the issue would also help me try to reproduce it.

tchiapuziowong commented 4 months ago

@stephanelechner @muralihcl Could you share the exact error output as well as the playbook? If this isn't information you can share in this forum, please escalate internally to your SE.

muralihcl commented 4 months ago

Hi @alagoutte / @tchiapuziowong,

Firmware version: XL.10.12.1000
Product: JL375A 8400 Base

Ansible task snippet: Gather facts

- name: Get VLAN details to check if the VLANs exist
  no_log: "{{ no_log_param }}"
  arubanetworks.aoscx.aoscx_facts:
    gather_subset:
      - host_name
    gather_network_resources:
      - vlans
  register: reg_vlan_details
  vars:
    ansible_host: "{{ aruba_switch_item['aruba_ip_address'] }}"
    ansible_user: "{{ aruba_username }}"
    ansible_password: "{{ aruba_password }}"
    ansible_network_os: arubanetworks.aoscx.aoscx
    ansible_connection: arubanetworks.aoscx.aoscx
    ansible_aoscx_validate_certs: false
    ansible_aoscx_use_proxy: false
    ansible_acx_no_proxy: true
    ansible_aoscx_rest_version: 10.09

Snippet: Configure L2 interface

- name: Configure the interface with the L2 values provided
  arubanetworks.aoscx.aoscx_l2_interface:
    interface: "{{ interface_item['input_values']['interface_name'] }}"
    vlan_mode: "{{ interface_item['input_values']['vlan_mode'] }}"
    vlan_access: "{{ interface_item['input_values']['vlan_access'] | default(omit) }}"
    trunk_allowed_all: "{{ interface_item['input_values']['trunk_allowed_all'] | default(omit) }}"
    vlan_trunks: "{{ interface_item['input_values']['vlan_trunks'] | default(omit) }}"
    native_vlan_id: "{{ interface_item['input_values']['native_vlan_id'] | default(omit) }}"
    description: "{{ interface_item['input_values']['interface_description'] | default(omit) }}"
    state: create
  register: reg_config_interface_l2
  vars:
    ansible_host: "{{ interface_item['input_values']['aruba_ip_address'] }}"
    ansible_user: "{{ aruba_username }}"
    ansible_password: "{{ aruba_password }}"
    ansible_network_os: arubanetworks.aoscx.aoscx
    ansible_connection: arubanetworks.aoscx.aoscx
    ansible_aoscx_validate_certs: false
    ansible_aoscx_use_proxy: false
    ansible_acx_no_proxy: true
    ansible_aoscx_rest_version: 10.09

I hope this is adequate to correlate. Every playbook has a fact-gathering step, used to verify prerequisites, followed by one or two configuration tasks, so every play opens at least two sessions per action. We hit the issue when I ran multiple use cases during a demonstration: within roughly 20 minutes I ran about 5 use cases, i.e. around 10 sessions, so by the time the 9th or 10th session was being established, the earliest ones had not yet timed out and the 6-session limit was exceeded.

Thank you,

Muralidhara Kakkunje

tchiapuziowong commented 4 months ago

Thank you for the information @muralihcl! Are you running this playbook multiple times against a single switch in a single thread, or are you running multiple instances of this playbook at one time?

tchiapuziowong commented 4 months ago

Also can you check the logs of the device and verify if there are absolutely no logout REST calls? Do you see any?

muralihcl commented 4 months ago

Hi @tchiapuziowong, these tasks are purposefully designed to be single-threaded. We observed the issue when we ran individual job templates (each making two or three connections to the Aruba switch) one after the other.

With regards,

Muralidhara Kakkunje

muralihcl commented 4 months ago

Quoting @tchiapuziowong: "Also can you check the logs of the device and verify if there are absolutely no logout REST calls? Do you see any?"

This is the task where we have seen the session limit exceeded error.

(screenshot: task output showing the "Session limit exceeded" error)

muralihcl commented 3 months ago

We can mark this as closed. When I tried to reproduce the behaviour, I could not. Even when I initiate multiple connections to the same switch within less than 5 minutes, I only see log-off messages in the switch logs. If I come across it again, I will raise a new issue.

muralihcl commented 3 months ago

During one of the meetings, we were told that, depending on access control, we can reuse the same session for the rest of the tasks within the playbooks. Has any new functionality been added to retrieve the session token, so that sessions are not consumed up to the limit?

tchiapuziowong commented 3 months ago

@muralihcl Apologies, I'm confused by the ask. Today it is designed so that only one session is consumed per task and closed at the end of that task. Are you still seeing sessions being consumed and not closed? You had previously stated you were not able to reproduce; has that changed? If so, how were you able to reproduce it?

muralihcl commented 3 months ago

Hi @tchiapuziowong ,

This is the Ansible workflow that launches the individual job templates one after another. With this, at any given point in time only one task is active and hence only one session is consumed.

(screenshot: the sequential Ansible workflow)

When we tried to make the workflow a bit more interesting by distributing the unrelated job templates to execute in parallel, we hit session count issues.

(screenshot: the parallelized workflow that hit the session limit)

Finally, we were able to get this working with 2 layers, so only two sessions per switch are active at any given point in time.

(screenshot: the reworked two-layer workflow)

I was just checking because, in one of the webinars, maintaining the session was mentioned. It would help if we had a way to log in first, save the session, and then perform the rest of the tasks using the same session token, so that one job template would consume only one session until it is completed.
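
For reference, pyaoscx itself exposes an explicit session object, so a single long-lived session (one login, one logout) looks roughly like the sketch below when using the SDK directly rather than the collection. This is only a sketch based on the pyaoscx examples; the address, credentials and the VLAN operation are illustrative, and session reuse like this is not a documented feature of the Ansible collection itself.

from pyaoscx.session import Session
from pyaoscx.vlan import Vlan

switch_ip = "10.0.0.1"           # placeholder switch address
s = Session(switch_ip, "10.09")  # one REST session for the whole run
s.open("admin", "secret")        # single login

try:
    # Perform as many operations as needed on the same session
    vlan = Vlan(s, 100, name="demo-vlan")
    vlan.apply()
finally:
    # A single logout at the very end frees the session slot on the switch
    s.close()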