ytti / oxidized

Oxidized is a network device configuration backup tool. It's a RANCID replacement!
Apache License 2.0

Since update to 0.24.0 - Cisco configs truncated, some Procurve not polled correctly #1401

Closed hunnymonster closed 4 years ago

hunnymonster commented 6 years ago

Since updating to Oxidized 0.24.0 the config is not retrieved correctly for a number of devices - example in code block below:

! 
! 
! VTP: Cisco IOS Software, C3750 Software (C3750-IPBASE-M), Version 12.2(35)SE5, RELEASE SOFTWARE (fc1)
! VTP: Copyright (c) 1986-2007 by Cisco Systems, Inc.
! VTP: Compiled Thu 19-Jul-07 19:15 by nachen
! VTP: Image text-base: 0x00003000, data-base: 0x01080000
! VTP: ROM: Bootstrap program is C3750 boot loader
! VTP: BOOTLDR: C3750 Boot Loader (C3750-HBOOT-M) Version 12.2(25r)SEE4, RELEASE SOFTWARE (fc1)
! VTP: BGHAC-SW02 uptime is 5 days, 10 hours, 39 minutes
! VTP: System returned to ROM by power-on
! VTP: System restarted at 23:02:39 BST Sat Jun 9 2018
! VTP: System image file is "flash:c3750-ipbase-mz.122-35.SE5/c3750-ipbase-mz.122-35.SE5.bin"
! VTP: cisco WS-C3750G-48PS (PowerPC405) processor (revision F0) with 118784K/12280K bytes of memory.
! VTP: Processor board ID FOC1046Z6TY
! VTP: Last reset from power-on
! VTP: 2 Virtual Ethernet interfaces
! VTP: 52 Gigabit Ethernet interfaces
! VTP: The password-recovery mechanism is enabled.
! VTP: 512K bytes of flash-simulated non-volatile configuration memory.
! VTP: Base ethernet MAC Address       : 00:1A:2F:B2:60:80
! VTP: Motherboard assembly number     : 73-10216-08
! VTP: Power supply part number        : 341-0108-03
! VTP: Motherboard serial number       : FOC104615G4
! VTP: Power supply serial number      : DCA1042A0BS
! VTP: Model revision number           : F0
! VTP: Motherboard revision number     : B0
! VTP: Model number                    : WS-C3750G-48PS-S
! VTP: System serial number            : FOC1046Z6TY
! VTP: Top Assembly Part Number        : 800-26853-01
! VTP: Top Assembly Revision Number    : A0
! VTP: Version ID                      : V05
! VTP: CLEI Code Number                : CNMWN00ARC
! VTP: Hardware Board Revision Number  : 0x09
! VTP: Switch   Ports  Model              SW Version              SW Image            
! VTP: ------   -----  -----              ----------              ----------          
! VTP: *    1   52     WS-C3750G-48PS     12.2(35)SE5             C3750-IPBASE-M      
! VTP: Configuration register is 0xF
! 
! VTP Version                     : 2
! Configuration Revision          : 83
! Maximum VLANs supported locally : 1005
! Number of existing VLANs        : 61
! VTP Operating Mode              : Client
! VTP Domain Name                 : #######
! VTP Pruning Mode                : Enabled
! VTP V2 Mode                     : Disabled
! VTP Traps Generation            : Disabled
! MD5 digest                      : 0x21 0x4F 0xA7 0x19 0xA0 0x63 0x91 0x64 
! Configuration last modified by 10.254.11.1 at 5-23-18 13:33:24
NAME: "GigabitEthernet1/0/51", DESCR: "1000BaseSX SFP"
PID:                     , VID:     , SN: AGM1447L1E2     

NAME: "GigabitEthernet1/0/52", DESCR: "1000BaseSX SFP"
PID:                     , VID:     , SN: AGM1105J38F     

No running config retrieved at all...

On the Procurve (predominantly 2512 & 2524) - the telnet session establishes, logs in and then does nothing to retrieve - just times out leading to failure.

ytti commented 6 years ago

Any chance of getting a sanitised system online?

ytti commented 6 years ago

Is the C3750 also fetched via telnet?

ytti commented 6 years ago

If the C3750 is also polled via telnet, it's likely the same issue on both boxes, and hopefully just one of them, whichever is easiest to put online in a sanitised manner, would be sufficient.

hunnymonster commented 6 years ago

Possibly - need to think how to put one online... In the middle of a LAN refresh project in a healthcare setting

ytti commented 6 years ago

I'm surprised there have been no more complaints in a week, if it is actually broken. I'm still very happy to work on this if I can get access to an affected device.

hunnymonster commented 6 years ago

@ytti where in the world are you? Might be easier for me to physically send you the device :)

davama commented 6 years ago

I have a similar model but don't experience your issue, @hunnymonster. I forced a config update and everything looks normal:

! Cisco IOS Software, C3750E Software (C3750E-UNIVERSALK9-M), Version 15.2(4)E, RELEASE SOFTWARE (fc2)
! 
! Image: Software: C3750E-UNIVERSALK9-M, 15.2(4)E, RELEASE SOFTWARE (fc2)
! Image: Compiled: Mon 28-Sep-15 20:06 by prod_rel_team
! Image: flash:c3750e-universalk9-mz.152-4.E.bin
! Chassis type: WS-C3750X-12S
! Memory: main 524288K
! Processor ID: FDO1619Z11B
! CPU: PowerPC405
@@ -54,7 +54,7 @@
 ! 
 ! 
 !
-! Last configuration change at 13:07:36 UTC Tue Jun 12 2018 by mgill
+! Last configuration change at 15:53:40 UTC Fri Jun 22 2018 by dvmacias
 ! NVRAM config last updated at 13:07:36 UTC Tue Jun 12 2018 by mgill
 !
 version 15.2
@@ -322,7 +322,6 @@ ip access-list standard SNMP-SERVER
 logging trap debugging
 logging origin-id hostname
 logging facility local0
-logging host ipv6 blabla
 !
 snmp-server community <redacted> RO ipv6 SNMP-SERVER6 SNMP-SERVER
 snmp-server location CORE

hunnymonster commented 6 years ago

I wonder if there's some show through from 0.23.0... if I get a chance on Monday, I'll try a clean install of all of the gems.

Big day - two stacks to replace (300+ connections) - planned downtime nil... Tis a good game...

ytti commented 6 years ago

@hunnymonster I'm in Cyprus, it's usually highest tier to ship things to inside EU.

I'm sorry for any trouble caused, and I'm very much willing to work on solving the telnet issue.

hunnymonster commented 6 years ago

No trouble to me - like a good boy, I've got a production monitor and a test system... Would a Wireshark capture of a session be any use? (Given it's telnet and not encrypted at all)

Trying to think how to get you visibility at least.

rvandepu commented 6 years ago

Hi, I've been having this problem with multiple Cisco switches (2960 & 2950), even before the latest version. If I recall correctly, the problem was gone after disabling the retrieval of VTP information. I'll try that again.

If you need more information to troubleshoot this, please tell me what you need.

ytti commented 6 years ago

This is interesting, and might explain what is going on. If the VTP output contains a string that matches the prompt, we become desynchronised: the VTP output stops there, and the next command's output stops before it even starts, because we see the actual prompt that followed the VTP command, and so forth.
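The desynchronisation described above can be sketched in plain Ruby. This is only an illustration under assumptions: the prompt pattern, the `read_until_prompt` helper, and the simulated device stream are all made up for the example and are not Oxidized's actual input code.

```ruby
# Hypothetical prompt pattern; real models define their own.
PROMPT = /^([\w.-]+# ?)$/

# Expect-style reader (illustrative only): consume chunks until the buffer
# contains a line that matches the prompt, then return everything read so far.
def read_until_prompt(chunks, prompt)
  buffer = +''
  until chunks.empty?
    buffer << chunks.shift
    break if buffer.match?(prompt)
  end
  buffer
end

# Simulated device stream, invented for this example.
stream = [
  "show vtp status\n",
  "VTP Version : 2\n",
  "backup-domain#\n",               # output line that happens to look like a prompt
  "VTP Pruning Mode : Enabled\n",
  "switch# ",                       # the real prompt after the first command
  "show running-config\n",
  "hostname switch\n",
  "switch# "
]

first  = read_until_prompt(stream, PROMPT)
second = read_until_prompt(stream, PROMPT)

puts "first read:\n#{first}"
puts "second read:\n#{second}"
```

In this sketch the first read stops at the look-alike line, so the second read returns the leftover VTP output and the real prompt instead of any running-config text, which would be consistent with a truncated or missing config.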

hunnymonster commented 6 years ago

I have run up an entirely new instance of oxidized 0.24.0 and my Cisco issue has cleared. The Procurve one persists but only on HP Procurve 2524 (coincidentally, they're the only devices I have using local logins)

mrplow87 commented 6 years ago

I can confirm the HP Procurve issue with Procurve/Aruba 2530 48G Switch (J9775A) switches running firmware YA.15.14.0007. The same model but with older firmware can be polled without an issue. Also, various other HP switches work fine. All switches use the same method for login.

With a clean install, I can use oxs on those switches once, after that I only get "closed stream" or "unable to connect" errors. Oxidized itself can never poll the device.

Attachments: debug_oxs.log, debug_oxidized.log

jerebernard commented 6 years ago

I have the same issue with newer HP Procurve models, e.g. the 2920-24G. So far I haven't found a solution.

martinohansen commented 6 years ago

We're also affected on HPE 2530 / J9772A switches running version: YA.15.12.0007 & YA.15.17.0009. We have the following in our log.

W, [2018-10-23T12:00:41.574423 #621]  WARN -- : 192.168.0.2 raised Oxidized::PromptUndetect with msg "Your previous successful login (as operator) was from 192.168.0.1 SWITCH>  not matching configured prompt (?-mix:^\r?([\w.-]+# )$)"
W, [2018-10-23T12:00:42.472674 #621]  WARN -- : SWITCH status no_connection, retry attempt 1
W, [2018-10-23T12:01:13.695231 #621]  WARN -- : 192.168.0.2 raised Oxidized::PromptUndetect with msg "Your previous successful login (as operator) was from 192.168.0.1 SWITCH>  not matching configured prompt (?-mix:^\r?([\w.-]+# )$)"
W, [2018-10-23T12:01:14.481242 #621]  WARN -- : SWITCH status no_connection, retry attempt 2
W, [2018-10-23T12:01:45.589876 #621]  WARN -- : 192.168.0.2 raised Oxidized::PromptUndetect with msg "Your previous successful login (as operator) was from 192.168.0.1 SWITCH>  not matching configured prompt (?-mix:^\r?([\w.-]+# )$)"
W, [2018-10-23T12:01:46.490674 #621]  WARN -- : SWITCH status no_connection, retry attempt 3
W, [2018-10-23T12:02:17.647341 #621]  WARN -- : 192.168.0.2 raised Oxidized::PromptUndetect with msg "Your previous successful login (as operator) was from 192.168.0.1 SWITCH>  not matching configured prompt (?-mix:^\r?([\w.-]+# )$)"
W, [2018-10-23T12:02:18.498470 #621]  WARN -- : SWITCH status no_connection, retries exhausted, giving up

#1595 seems to be very similar to what we are experiencing. Let me know if I should provide more information. Thanks.
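To see why the pattern in the log rejects this input, it can be tested outside Oxidized. The sketch below is only a guess at the situation: the `configured` pattern is copied from the log message, while the `relaxed` pattern and the two candidate strings are assumptions, since the log does not show whether the banner and the prompt arrived on the same line.

```ruby
# Prompt pattern copied from the log message above.
configured = /^\r?([\w.-]+# )$/

# Hypothetical relaxed pattern that also accepts an operator-level ">" prompt.
relaxed = /^\r?([\w.-]+[#>] )$/

# Two guesses at what the switch actually sent.
banner        = 'Your previous successful login (as operator) was from 192.168.0.1'
same_line     = "#{banner} SWITCH> "
separate_line = "#{banner}\r\nSWITCH> "

{ 'same line' => same_line, 'separate line' => separate_line }.each do |label, text|
  puts "#{label}: configured=#{text.match?(configured)} relaxed=#{text.match?(relaxed)}"
end
```

Under these assumptions the configured pattern fails either way because it requires `# `, and the relaxed one only helps if the prompt sits on its own line. Whether an operator-level session can then run the commands the model needs is a separate question.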

mrplow87 commented 6 years ago

On newer models, there's the command no banner last-login. If set, I get different results when trying to connect to the switch.

With "last-login" enabled: gems/oxidized-script-0.5.1/lib/oxidized/script/script.rb:113:in `connect': unable to connect (Oxidized::)

With "last-login" disabled: gems/net-ssh-4.1.0/lib/net/ssh/buffered_io.rb:102:in `send': closed stream (IOError)

Referencing https://github.com/ytti/oxidized/issues/1595

Edit: Seems to have been a one-time issue. I haven't been able to reproduce it again. But still worth mentioning.

Edit 2: When using ssh.rb from v0.21 with v0.24, the issue is gone. https://github.com/ytti/oxidized/issues/1436

Edit 3: I'd like to add that the "fix" to use v0.21's ssh.rb also solves a timeout issue with oxidized-script when running configuration commands (which have no output) on HP Procurve switches.

mrplow87 commented 5 years ago

In oxidized 0.26.3 the issue is still present.

So my current update process is to fetch the repo, install the oxidized Ruby gem and copy over the ssh.rb from 0.21.0

@martinohansen: Did you try the no banner last-login command? Otherwise: try to update your switch to the YA.16.xx branch of firmware. We have multiple devices on that firmware that can be polled without an issue (confirmed for YA.16.03.0004 and YA.16.02.0012).

martinohansen commented 5 years ago

@mrplow87 using the ssh.rb from 0.21.0 does not resolve our issue, unfortunately.

The command no banner is not available on YA.15.12. We're probably just gonna update the switches, thanks for your input.

wk commented 4 years ago

Is this issue still active in a current version of Oxidized?

jerebernard commented 4 years ago

Hello, yes, the issue is still present on some Procurve versions.


no-response[bot] commented 4 years ago

This issue has been automatically closed because there has been no response to our request for more information from the original author. The information that is currently in the issue is insufficient to take further action. Feel free to re-open this issue if additional information becomes available, or if you believe it has been closed in error.

Faddy96 commented 4 years ago

I've also had this issue on my company procurve switches, they are running in older versions and we're not gonna upgrade them any time soon. I solved it by changing the regex pattern, though I'm not confident with doing a PR with my solution.

dhapheto commented 3 years ago

I've also had this issue on my company procurve switches, they are running in older versions and we're not gonna upgrade them any time soon. I solved it by changing the regex pattern, though I'm not confident with doing a PR with my solution.

Hi DarkCatapulter,

Could you please share your regex pattern? Thanks in advance.

gbeekmans commented 3 years ago

Just another report that the latest version of Oxidized as of yesterday (showing as 0.28.0) is having issues with pulling config files from older devices. I'm having some issues with HP 3500yl and HP 2510 switches that run very old firmware. The 2530s with latest & greatest firmware don't have issues.

I, too, worked around the issues by copying ssh.rb from oxidized-0.21.0 into /var/lib/gems/2.7.0/gems/oxidized-0.28.0/lib/oxidized/input which makes SSH work again to those switches and pull configuration.

With the 0.28 version of ssh.rb, it can't SSH, period. The error it encounters is: raised IOError with msg "closed stream". It then falls back to telnet, which can't seem to deal with the various control characters that HP likes to output. This could just be an issue in the procurve.rb model. I don't have the Ruby experience to know where to start fixing this.

There have been various similar reports of this problem.
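As a rough illustration of the control-character problem, the sketch below strips common ANSI CSI escape sequences before testing for a prompt. Everything in it is an assumption for the example: the `strip_control` helper, the sample bytes, and the idea that these particular sequences are what the switch sends. It is not how procurve.rb handles them.

```ruby
# Matches common ANSI CSI sequences such as "\e[2K" or "\e[24;1H".
# This does not cover every possible terminal control sequence.
ANSI_CSI = /\e\[[0-9;?]*[A-Za-z]/

# Prompt pattern as it appears in the log output earlier in this thread.
PROMPT = /^\r?([\w.-]+# )$/

# Hypothetical helper: remove CSI sequences from device output.
def strip_control(text)
  text.gsub(ANSI_CSI, '')
end

# Invented sample: a prompt wrapped in cursor-movement sequences.
raw     = "\e[2K\e[24;1HSWITCH# \e[24;9H"
cleaned = strip_control(raw)

puts "raw matches prompt:     #{raw.match?(PROMPT)}"
puts "cleaned matches prompt: #{cleaned.match?(PROMPT)}"
```

With this made-up input, the raw text never matches because the prompt is surrounded by escape sequences, while the cleaned text does. Capturing the real bytes from an affected switch would show whether the same thing happens there.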

davama commented 3 years ago

Have you tried using a different prompt regex for those particular old switches? Perhaps create an alternate procurveold.rb model with that adjusted prompt.

Maybe something like this:

cat  ~/.config/oxidized/model/procurveold.rb 
require 'oxidized/model/procurve.rb'

class PROCURVEOLD < Procurve
  prompt /^<myregex>$/
end

I haven't tested this, so look around at how others have created their own models.