Closed tyler-8 closed 6 years ago
You should enable debugging and post the output of the devices debug file.
@laf Appreciate the reply. I turned on debug and the entire config was pulled via SSH and the session closed with an exit successfully. So it would seem that the entire config is not making it into the local git repo? If I download the config through Oxidized it's incomplete, however.
Can you post it? Just trim the config so no sensitive info is in.
I can't I'm afraid (it's a dev environment and I can only access it through a funky web client with no copy/paste ability right now).
I have manually combed the debug output and aside from some parse errors for non-existent commands I don't see any errors. Attachment of the parse errors.
Is that not the device telling you those commands aren't running?
@laf The commands that return the Parse error are expected based on the AOSW model. It's just attempting them without failing the whole workflow, but the show running-config does return correct output as shown in the screenshot. I've cut the full output of the running config in the screenshot.
Ok but this isn't showing the oxidized part to see what's going on. Can you not run a cli pastebin client to upload the text?
I'm using the debug config outlined here: https://github.com/ytti/oxidized#debugging and all of the files it generates only contain the SSH outputs from the devices, even the devices that are working correctly in oxidized. The last line I see in the debug file, for example, is test_vc#exit. Is there another way to enable debugging for Oxidized's processing?
No I think that's it but not 100% sure.
Sorry I'm out of ideas :(
@laf I appreciate you trying nonetheless - I'm not a Ruby guy otherwise I'd be approaching this differently. I'll continue to tinker as I have time until I or someone else figures it out.
Curious, are you running CentOS 7? I'm running into the same issue. Seems to only affect Aruba devices, everything else works fine. I've also tried changing the output from git to file and the issue remains. I have several other devices that have MUCH larger configs, so it doesn't feel like an SSH buffer issue.
@chipgwyn Yes, running CentOS7. It's definitely not an SSH issue because the debug output shows the entire contents of the configuration. It's only when Oxidized is taking that data and parsing it somehow that it's getting cut short. I've looked through the config to see if there were some sort of special characters and there isn't anything of concern there.
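The comparison described here (complete output in the SSH debug file, shorter config after processing) can be narrowed down by finding where the stored copy stops. The following is only a rough sketch: `truncation_report` is a made-up helper name, the sample strings are invented, and a real stored config may differ from the raw capture (comment prefixes, stripped lines), so the last stored line might not be found verbatim.

```ruby
# Hypothetical helper: report how far into the full device output the
# stored config reaches. Both arguments are plain strings.
def truncation_report(full_output, stored_config)
  full_lines   = full_output.lines.map(&:strip).reject(&:empty?)
  stored_lines = stored_config.lines.map(&:strip).reject(&:empty?)
  last_stored  = stored_lines.last
  # Position of the last stored line inside the full capture (nil if absent)
  index = last_stored && full_lines.rindex(last_stored)
  {
    full_count:   full_lines.size,
    stored_count: stored_lines.size,
    last_stored:  last_stored,
    # First line of the full output that did not make it into the stored copy
    next_in_full: index && full_lines[index + 1]
  }
end

# Invented sample data, not taken from a real device
full   = "version 6.5\nhostname test_vc\nwlan ssid-profile guest\nend\n"
stored = "version 6.5\nhostname test_vc\n"

report = truncation_report(full, stored)
puts report
```

Running this over several pulls would show whether the cut always lands next to the same kind of line, which is easier to reason about than eyeballing diffs.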
I tried paring down all the stuff in aosw.rb to only the "show running-config" command, and the issue still exists.
@chipgwyn Good validation there, that was going to be my next step. I'm beginning to wonder if something from one of these functions is causing the issue:
```ruby
  # first function (its def line is missing from the paste)
    out = []
    cfg.each_line do |line|
      out << line.rstrip
    end
    out = out.join "\n"
    out << "\n"
  end

  def clean cfg
    out = []
    cfg.each_line do |line|
      # drop the temperature, fan speed and voltage, which change each run
      next if line.match /Output \d Config/i
      next if line.match /(Tachometers|Temperatures|Voltages)/
      next if line.match /((Card|CPU) Temperature|Chassis Fan|VMON1[0-9])/
      next if line.match /[0-9]+\s+(RPMS?|m?V|C)/i
      out << line.strip
    end
    out = comment out.join "\n"
    out << "\n"
  end
```
The confounding thing is it's not consistent. Where it breaks in the file seems to be different and it doesn't do it every time. I was thinking maybe it's something to do with the number of devices. I ran oxidized with only three devices: two controllers and one normal Cisco switch. The Cisco pulled perfectly every time, the controllers seemed kind of random.
I think my next step is to check the file that's saved with the ssh debug and see if there are any odd characters that may throw things off. Perhaps a LF/CRLF issue or something... Very odd.
I've observed the same behavior. Here are a couple screenshots of the "changes" that oxidized sees, even though they aren't actually changing, it just fails to parse at different parts of the config.
I see this has the 'bug' label added to it. Has anyone identified the issue? I'd be happy to provide any output or troubleshooting data, or even do an interactive session if needed. Just let me know what data is needed.
Same problem here using the firewareos model, tested with both file and git output. The timeout has been raised to 300 sec, but output from the device takes about 40 sec in total. It seems the app is not processing the full length of the data returned from the commands it runs.
If someone can provide access to a device then I can probably take more of a look?
@laf I can escort you into a device for a bit, help gather data and such. I'll hit you in IRC and can work out the details. Probably will be Monday or Tuesday before I can arrange something. Thanks!
Sounds good, I'm on the gitter channel so just highlight me when you're ready.
So it looks like changing prompt /^\(?.+\)?\s?[#>]/ to prompt /^\(?.+\)?\s[#>]/ works for @chipgwyn
@tyler-8 Can you check if that works for you? Basically it makes the space between something and the # mandatory rather than optional. I don't know these devices well enough to know if that's correct or not.
@vppencilsharpener @thanegill You've made changes to this model before, does that look correct now to you?
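To make the optional-versus-mandatory space concrete, here is a small sketch that runs both patterns from this thread against a few prompt strings. The prompt strings are assumptions for illustration, not captured from a real device, and `prompt_matches` is a made-up helper.

```ruby
OPTIONAL_SPACE  = /^\(?.+\)?\s?[#>]/  # prompt before the change
MANDATORY_SPACE = /^\(?.+\)?\s[#>]/   # prompt proposed above

# Hypothetical helper: which of the two patterns match a given line?
def prompt_matches(line)
  {
    optional:  !!(line =~ OPTIONAL_SPACE),
    mandatory: !!(line =~ MANDATORY_SPACE)
  }
end

# Assumed prompt shapes: with a space before '#', and without one
["(host) #", "(host)#", "host#"].each do |line|
  puts "#{line.inspect} => #{prompt_matches(line)}"
end
```

Only the first string has whitespace directly before the `#`, so it is the only one the stricter pattern accepts; the original pattern accepts all three.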
I've been experiencing this same issue. Looks like this commit 4e6fc650d326e146558627fd6f13ac301fe24450 may be the breaking change. I'll change my prompt to the one that @laf suggested and do some testing.
I thought the user who had committed that had closed their github account but they haven't.
@vppencilsharpener can you take a look at this and see if the proposed change above will break things for you?
@thanegill Be great if you can report back on your findings.
@laf That new prompt seems to be working. The config hasn't flapped at all since I made the change (2 days ago), where it used to happen almost every time.
I apologize for being late to the party on this one. I had actually given up on my installation until yesterday.
Any chance we can get more information on what devices are being used by others? I'm wondering if the IAP is different enough that it needs to be handled in a special way or if the problem could be elsewhere.
It looks like making the space mandatory breaks the compatibility with the Instant APs (IAP), or at least firmware version 6.5.4.3_61959.
The prompt on my device looks like this: APHostname#
Debug output complains about the prompt.
D, [2018-04-02T12:40:26.357941 #3773] DEBUG -- : lib/oxidized/input/ssh.rb: Connecting to IAPControllerIP
D, [2018-04-02T12:40:27.101196 #3773] DEBUG -- : lib/oxidized/input/ssh.rb: expecting [/^\(?.+\)?\s[#>]/] at IAPControllerIP
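Before changing the model again, a candidate prompt pattern can be vetted against sample prompts and sample config lines. The sketch below is only an illustration: `vet_prompt` is a hypothetical helper, "(Controller) #" is an assumed controller-style prompt, and the config line is made up. That the looser pattern can also match inside config output is one plausible explanation for the truncation, not something confirmed in this thread.

```ruby
# Hypothetical checker: a prompt pattern should match every sample prompt
# and none of the sample config lines.
def vet_prompt(pattern, prompts:, config_lines:)
  {
    missed_prompts: prompts.reject { |prompt| prompt.match?(pattern) },
    false_matches:  config_lines.select { |line| line.match?(pattern) }
  }
end

prompts      = ["APHostname#", "(Controller) #"] # second entry is an assumed format
config_lines = ['wlan ssid-profile "lab#1"']     # made-up config line containing '#'

old_result = vet_prompt(/^\(?.+\)?\s?[#>]/, prompts: prompts, config_lines: config_lines)
new_result = vet_prompt(/^\(?.+\)?\s[#>]/,  prompts: prompts, config_lines: config_lines)

puts "optional space:  #{old_result}"
puts "mandatory space: #{new_result}"
```

With these samples the optional-space pattern matches both prompts but also the config line, while the mandatory-space pattern rejects the config line and misses the IAP prompt, so neither passes both checks.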
I had previously seen the problem with "changes" that were not really changes.
I'm using the https://github.com/ytti/oxidized/blob/master/lib/oxidized/model/aosw.rb model and it doesn't appear to be grabbing the entire config. Looking through the diffs, it appears that it grabs varying amounts of the configuration each time. Sometimes it'll make it further into the config than other times. I've verified there's no paging issue (show running-config just dumps the entire config in one screen). So I'm not sure what to tweak. Is the session being closed too early? Is there a setting I can tweak that would improve this behavior? It doesn't appear to be a connection speed or stability issue either. I have other devices (IOS/Procurve) that are being pulled successfully from the same location, it's only the AOSW device.
I'm happy to provide any other information to help narrow in on the cause. Thanks in advance!