CiscoTestAutomation / unicon.plugins

pyATS Unicon Plugins
https://developer.cisco.com/pyats/
Apache License 2.0
23 stars 46 forks source link

Syslog messages interupting connection #71

Open sngx13 opened 1 year ago

sngx13 commented 1 year ago

Hi,

I've noticed that when Unicon sees a syslog message (console session) it stops and waits, in the example below, there was a RESTCONF connection made at the time Unicon has connected to device. Any actions stop from there on, connection will eventually timeout.

I'm guessing this has something to do with: SYSLOG_WAIT which by default is 1, I couldn't find any documentation around this value, could you please elaborate?

2022-11-23 11:33:17,885: %UNICON-INFO: +++ DUT-ISR4331-K9 logfile /opt/autom8_app/autom8_app/logs/pyats/DUT-ISR4331-K9.log +++

2022-11-23 11:33:17,885: %UNICON-INFO: +++ Unicon plugin iosxe (unicon.plugins.iosxe) +++
Trying 10.113.128.141...
Connected to 10.113.128.141.
Escape character is '^]'.

2022-11-23 11:33:17,914: %UNICON-INFO: +++ connection to spawn: telnet 10.113.128.141 10012, id: 140555820049808 +++

2022-11-23 11:33:17,914: %UNICON-INFO: connection to DUT-ISR4331-K9

Login: automation
Password:

2022-11-23 11:33:17,964: %UNICON-INFO: Executing post credential command: sendline()

Nov 23 11:33:46.605 GMT: %DMI-5-AUTH_PASSED: R0/0: dmiauthd: User 'automation' authenticated successfully from 10.113.128.165:0  for rest over http. External groups: PRIV15
Username:automation
Password:
Taarini commented 1 year ago

Hi

Can you please elaborate more on the issue you face.

If you want to adjust the value of SYSLOG_WAIT then please add the below in the testbed.yaml,

connections:
     settings:
         SYSLOG_WAIT: 1 <<
Taarini commented 1 year ago

Also there isn't any document specific to SYSLOG_WAIT. As you said the default value is 1

sngx13 commented 1 year ago

Hi @Taarini, as you can see flow log output, connection hangs straight after syslog message is produced when a RESTCONF session was made to the same device Unicon was trying to connect to.

sngx13 commented 1 year ago

Here is another example, i'm executing 'show ip bgp vpnv4 all neighbors' followed by 'ping', note output stops at:

_'Nov 25 15:26:01.124 GMT: %DMI-5-AUTHPASSED: R0/0: dmiauthd: User 'automation' authenticated successfully from 10.113.128.165:0 for rest over http. External groups: PRIV15'

2022-11-25 15:25:57,665: %UNICON-INFO: +++ connection to spawn: telnet 80.194.79.121 10014, id: 140147439829056 +++

2022-11-25 15:25:57,665: %UNICON-INFO: connection to DUT-C1117-4PM

Login: automation
Password: 

2022-11-25 15:25:57,712: %UNICON-INFO: Executing post credential command: sendline()

DUT-C1117-4PM#

2022-11-25 15:25:57,821: %UNICON-INFO: Learned hostname(s): 'DUT-C1117-4PM'.

2022-11-25 15:25:57,885: %UNICON-INFO: +++ DUT-C1117-4PM with via 'a': enable +++

2022-11-25 15:25:58,661: %UNICON-INFO: +++ DUT-C1117-4PM with via 'a': executing command 'show ip bgp vpnv4 all neighbors' +++
show ip bgp vpnv4 all neighbors
BGP neighbor is 100.64.0.1,  vrf DUT_VRF,  remote AS 5089, external link
  BGP version 4, remote router ID 10.65.0.18
  BGP state = Established, up for 2d00h
  Last read 00:00:02, last write 00:00:02, hold time is 15, keepalive interval is 5 seconds
  Last update received: 03:27:15
  Neighbor sessions:
    1 active, is not multisession capable (disabled)
  Neighbor capabilities:
    Route refresh: advertised and received(new)
    Four-octets ASN Capability: advertised and received
    Address family IPv4 Unicast: advertised and received
    Address family IPv6 Unicast: received
    Address family IPv4 Multicast: received
    Address family IPv6 Multicast: received
    Address family IPv4 Flowspec: received
    Address family IPv6 Flowspec: received
    Enhanced Refresh Capability: advertised
    Multisession Capability: 
    Stateful switchover support enabled: NO for session 1
  Message statistics:
    InQ depth is 0
    OutQ depth is 0
 --More--             
                         Sent       Rcvd
    Opens:                  1          1
    Notifications:          0          0
    Updates:                2          7
    Keepalives:         34204      34340
    Route Refresh:          0          0
    Total:              34207      34348
  Do log neighbor state changes (via global configuration)
  Default minimum time between advertisement runs is 0 seconds

 For address family: VPNv4 Unicast
  Translates address family IPv4 Unicast for VRF DUT_VRF
  Session: 100.64.0.1
  BGP table version 9, neighbor version 9/0
  Output queue size : 0
  Index 1, Advertise bit 0
  1 update-group member
  Slow-peer detection is disabled
  Slow-peer split-update-group dynamic is disabled
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               2          3 (Consumes 408 bytes)
 --More--             Prefixes Total:                 2          5
    Implicit Withdraw:              0          0
    Explicit Withdraw:              0          2
    Used as bestpath:             n/a          1
    Used as multipath:            n/a          0
    Used as secondary:            n/a          0

                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    Bestpath from this peer:              4        n/a
    Total:                                4          0
  Number of NLRIs in the update sent: max 2, min 0
  Current session network count peaked at 4 entries at 13:00:25 Nov 24 2022 GMT (1d02h ago)
  Highest network count observed at 4 entries at 13:00:25 Nov 24 2022 GMT (1d02h ago)
  Last detected as dynamic slow peer: never
  Dynamic slow peer recovered: never
  Refresh Epoch: 1
  Last Sent Refresh Start-of-rib: never
  Last Sent Refresh End-of-rib: never
  Last Received Refresh Start-of-rib: never
  Last Received Refresh End-of-rib: never
 --More--                                Sent   Rcvd
    Refresh activity:          ----   ----
      Refresh Start-of-RIB          0          0
      Refresh End-of-RIB            0          0

  Address tracking is enabled, the RIB does have a route to 100.64.0.1
  Route to peer address reachability Up: 1; Down: 0
    Last notification 2d00h
  Connections established 1; dropped 0
  Last reset never
  External BGP neighbor configured for connected checks (single-hop no-disable-connected-check)
  Interface associated: GigabitEthernet0/0/0.400 (peering address in same link)
  Transport(tcp) path-mtu-discovery is enabled
  Graceful-Restart is disabled
  SSO is disabled
Connection state is ESTAB, I/O status: 1, unread input bytes: 0            
Connection is ECN Disabled, Mininum incoming TTL 0, Outgoing TTL 1
Local host: 100.64.0.0, Local port: 20424
Foreign host: 100.64.0.1, Foreign port: 179
Connection tableid (VRF): 2
Maximum output segment queue size: 50

Enqueued packets for retransmit: 0, input: 0  mis-ordered: 0 (0 bytes)

Event Timers (current time is 0xA7F6FCC):
Timer          Starts    Wakeups            Next
Retrans         34206          0             0x0
TimeWait            0          0             0x0
AckHold         34348      32407             0x0
SendWnd             0          0             0x0
KeepAlive           0          0             0x0
GiveUp              0          0             0x0
PmtuAger            1          1             0x0
DeadWait            0          0             0x0
Linger              0          0             0x0
ProcessQ            0          0             0x0

iss: 1737971400  snduna: 1738621417  sndnxt: 1738621417
irs:  678537274  rcvnxt:  679190114

sndwnd:  32768  scale:      0  maxrcvwnd:  16384
rcvwnd:  16118  scale:      0  delrcvwnd:    266

SRTT: 1000 ms, RTTO: 1003 ms, RTV: 3 ms, KRTT: 0 ms
minRTT: 0 ms, maxRTT: 1000 ms, ACK hold: 200 ms
uptime: 175125968 ms, Sent idletime: 2008 ms, Receive idletime: 2208 ms 
Status Flags: active open
Option Flags: VRF id set, nagle, path mtu capable
IP Precedence value : 6

Datagrams (max data segment is 1024 bytes):
Rcvd: 68553 (out of order: 0), with data: 34348, total data bytes: 652839
Sent: 67252 (retransmit: 0, fastretransmit: 0, partialack: 0, Second Congestion: 0), with data: 34206, total data bytes: 650016

 Packets received in fast path: 0, fast processed: 0, slow path: 0
 fast lock acquisition failures: 0, slow path: 0
TCP Semaphore      0xFFFF763EFA18  FREE 

DUT-C1117-4PM#
Nov 25 15:26:01.124 GMT: %DMI-5-AUTH_PASSED: R0/0: dmiauthd: User 'automation' authenticated successfully from 10.113.128.165:0  for rest over http. External groups: PRIV15
Nov 25 15:26:01.190 GMT: %SERVICES-2-NORESOLVE_ACTIVE: R0/0: linux_iosd-image: Error resolving active FRU: BINOS_FRU_CC
Nov 25 15:26:01.190 GMT: %SERVICES-2-NORESOLVE_STANDBY: R0/0: linux_iosd-image: Error resolving standby FRU: BINOS_FRU_CC
DUT-C1117-4PM#

Previous command run output succeeded :

DUT-C1117-4PM#

2022-11-25 15:25:11,848: %UNICON-INFO: +++ DUT-C1117-4PM with via 'a': executing command 'ping vrf DUT_VRF 100.64.0.1' +++
ping vrf DUT_VRF 100.64.0.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 100.64.0.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/4/4 ms
DUT-C1117-4PM#

It seems like any syslog messages interrupt the process.

Taarini commented 1 year ago

From the logs, it looks like it don't originate from unicon. Are you using any other custom modules

Nov 25 15:26:01.124 GMT: **%DMI-5-AUTH_PASSED**
sngx13 commented 1 year ago

Yes as described, a RESTCONF request (generating log message) to device whilst Unicon is connected seems to interrupt the process.

Taarini commented 1 year ago

Is it possible for you to share the logs with debug enabled, so that we can see what is happening.

sngx13 commented 1 year ago

Hi,

I've now set the boot variable and the issue has gone as the warning message has disappeared. I'm getting a consistent reload each time whereas before there was a race condition when reload could happen if message wasn't displayed in time.

Thank you

On Thu, 22 Dec 2022, 19:02 Taarini, @.***> wrote:

Is it possible for you to share the logs with debug enabled, so that we can see what is happening.

— Reply to this email directly, view it on GitHub https://github.com/CiscoTestAutomation/unicon.plugins/issues/71#issuecomment-1363190664, or unsubscribe https://github.com/notifications/unsubscribe-auth/AD2CGUHGZQWVWTOZ7W3MN5DWOSJUFANCNFSM6AAAAAASI33JRM . You are receiving this because you authored the thread.Message ID: @.***>

Taarini commented 1 year ago

Sorry I'm not able to understand.

Correct me if I wrong. You are saying the problem got fixed by setting boot variable but you get another issue with reload?

sngx13 commented 1 year ago

Setting boot variable fixes the warning message when reload is issued which is the culprit of the original issue. I'm asking whether this warning could be added in the list of expected prompts within Unicon :-)

On Fri, 23 Dec 2022, 01:57 Taarini, @.***> wrote:

Sorry I'm not able to understand.

Correct me if I wrong. You are saying the problem got fixed by setting boot variable but you get another issue with reload?

— Reply to this email directly, view it on GitHub https://github.com/CiscoTestAutomation/unicon.plugins/issues/71#issuecomment-1363480577, or unsubscribe https://github.com/notifications/unsubscribe-auth/AD2CGUHI7T4STFS4LRBH53TWOT2IFANCNFSM6AAAAAASI33JRM . You are receiving this because you authored the thread.Message ID: @.***>

Taarini commented 1 year ago

Hi

Can you please share the traceback of the error which you got and please set debug = True, while running the script again and share the logs.

Taarini commented 1 year ago

Hi

Can you please share the requested logs

sngx13 commented 1 year ago

Hi,

I've explained what's happening, a warning message is displayed when boot variable is not set which stops reload from going ahead. If boot var is set, message is not displayed and reload happens as normal, it's therefore not handled properly by your default dialogues.

Thank you

On Tue, 10 Jan 2023, 21:32 Taarini, @.***> wrote:

Hi

Can you please share the requested logs

— Reply to this email directly, view it on GitHub https://github.com/CiscoTestAutomation/unicon.plugins/issues/71#issuecomment-1377902640, or unsubscribe https://github.com/notifications/unsubscribe-auth/AD2CGUDBE47AIIXSVV7AWOTWRXIOJANCNFSM6AAAAAASI33JRM . You are receiving this because you authored the thread.Message ID: @.***>

Taarini commented 1 year ago

Hi

Please share the logs with debug enabled so that we will have more information on what is happening. Right now from the examples you provided in the previous comment seems like the prompt is seen after the syslog message

DUT-C1117-4PM#
Nov 25 15:26:01.124 GMT: %DMI-5-AUTH_PASSED: R0/0: dmiauthd: User 'automation' authenticated successfully from 10.113.128.165:0  for rest over http. External groups: PRIV15
Nov 25 15:26:01.190 GMT: %SERVICES-2-NORESOLVE_ACTIVE: R0/0: linux_iosd-image: Error resolving active FRU: BINOS_FRU_CC
Nov 25 15:26:01.190 GMT: %SERVICES-2-NORESOLVE_STANDBY: R0/0: linux_iosd-image: Error resolving standby FRU: BINOS_FRU_CC
DUT-C1117-4PM#

If you look at the timestamp, it isn't taking much time to retrieve the prompt.

So for the case where you say when boot var is not set the program hangs, we need debug information so that we can exactly know which condition in code is causing that.

Similarly based on your observation, when boot var is set, the program runs smoothly. So please share the logs with debug enabled for this case

So we can compare both the logs and let you know on what can be done

Taarini commented 1 year ago

Hi

May I know whether you are planning to share the logs for boths cases with debug enabled