MJL85 / natlas

natlas - Network Discovery and Auto-Diagramming
GNU General Public License v2.0

ValueError: invalid literal for int() with base 0: '' #5

Open kamakazikamikaze opened 9 years ago

kamakazikamikaze commented 9 years ago

I ran the latest mnet.py graph solution and encountered an error. I'm not sure whether it's because there are too many device entries or because my depth setting is too deep. As a dry run, I first ran it with depths of 0 and 1 to make sure all dependencies were properly installed, and both runs finished successfully.

This instance was given a depth of 10. Not too sure if that's too deep but it did correctly detect devices down to a depth of 7.

I've uploaded the log to my Google Drive since I cannot directly upload .txt files. Although I've censored it, it still shows all the steps and the depth it was able to reach.

MJL85 commented 9 years ago

Looks like a bug. It's happening at the point where it's looking at the management IPs for CDP neighbors on akh-114-a. It could be bad data from SNMP. Does the CDP neighbor info on that switch have any weird things going on?

edit: The depth is really only significant for limiting the used stack space. Since the thing is recursive we don't want to go 1000 functions deep. I'm pretty confident that python could probably handle a fairly large depth value, certainly much more than 10. So I don't think you're going to have any trouble from that aspect.
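
To illustrate the point about depth, here is a minimal sketch of a depth-limited recursive crawl. It is not the natlas code: the `NEIGHBORS` table and the `crawl` function are hypothetical stand-ins for the CDP data and the real crawler, and the visited list is an assumption about how redundant links are kept from causing loops.

```python
import sys

# Hypothetical neighbor table standing in for CDP data gathered over SNMP.
NEIGHBORS = {
    "core": ["dist-a", "dist-b"],
    "dist-a": ["core", "access-1"],
    "dist-b": ["core"],
    "access-1": ["dist-a"],
}

def crawl(node, depth, visited=None):
    """Visit node and its neighbors, recursing at most `depth` levels."""
    if visited is None:
        visited = []
    if node in visited:
        # Already seen (for example via a redundant uplink), so stop here.
        return visited
    visited.append(node)
    if depth <= 0:
        # Depth budget used up: record the node but do not go further.
        return visited
    for child in NEIGHBORS.get(node, []):
        crawl(child, depth - 1, visited)
    return visited

print(crawl("core", 1))          # root plus its direct neighbors
print(crawl("core", 10))         # every reachable node in this small table
print(sys.getrecursionlimit())   # Python's own recursion ceiling
```

Each level of the crawl uses one stack frame, so a depth of 10 costs roughly ten frames, which is far below Python's default recursion limit.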

kamakazikamikaze commented 9 years ago

Funnily enough, I'm currently SSH'd into that. The switch is a C3750 stack with two members. The G1/0/1 ports on both are connected to the AKH switch stack (one as a redundant link in case of failure). Perhaps we're stuck in a loop? I doubt it, because many, if not all, C3750 and C3850 stacks have at least one redundant feed and a depth of 1 ran successfully.

Here's the CDP output.

akh-114-a#sho cdp nei
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone,
                  D - Remote, C - CVTA, M - Two-port Mac Relay

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
akh-1a-a.dcs.byu.edu
                 Gig 2/0/1         127              S I   WS-C3750- Gig 1/0/2
akh-1a-a.dcs.byu.edu
                 Gig 1/0/1         165              S I   WS-C3750- Gig 2/0/2
SEP000DBCCC6AF7  Fas 1/0/31        143             H P M  IP Phone  Port 1
SEP000DBCCC6750  Fas 1/0/32        126             H P M  IP Phone  Port 1
SEP000DBCD9409D  Fas 1/0/28        157             H P M  IP Phone  Port 1
SEP00082166EDC6  Fas 1/0/6         162             H P M  IP Phone  Port 1
SEP0006D74B186C  Fas 1/0/25        138             H P M  IP Phone  Port 1
SEP081FF3636C98  Fas 1/0/20        179             H P M  IP Phone  Port 1
SEP0007EB2F8489  Fas 1/0/35        150             H P M  IP Phone  Port 1
SEP000821D1B912  Fas 1/0/11        172             H P M  IP Phone  Port 1
SEP64AE0C5FDBCE  Fas 1/0/7         167             H P M  IP Phone  Port 1
SEPD0574C6AC535  Fas 1/0/38        125             H P M  IP Phone  Port 1
SEP1C1D86C56A95  Fas 1/0/39        161             H P M  IP Phone  Port 1
SEP000DBCCC6C72  Fas 1/0/29        171             H P M  IP Phone  Port 1
SEPA0CF5B801459  Fas 1/0/48        147             H P M  IP Phone  Port 1
SEPA418758ADC75  Fas 1/0/41        124             H P M  IP Phone  Port 1
SEP54781A1CFC91  Fas 1/0/47        123             H P M  IP Phone  Port 1
SEP000DBCD94099  Fas 1/0/19        150             H P M  IP Phone  Port 1
SEP00097CEC99B4  Fas 1/0/21        124             H P M  IP Phone  Port 1
SEP3037A616B384  Fas 1/0/46        123             H P M  IP Phone  Port 1
AKH-114-100-AN1  Fas 1/0/1         9                 S    Xirrus XR Gig1

Nothing here seems out of the ordinary (from my novice observation). I know that redundant feeds aren't an issue because the depth 1 run had this in its 'finished' output:

[None] dc-3n604e-a-corea:gi7/21 -(UNKNOWN)-> dc-3n620e-b-cr11:gi7/10
    UNKNOWN -> None
[None] dc-3n604e-a-corea:gi7/22 -(UNKNOWN)-> dc-3n620e-b-cr11:gi7/7
    UNKNOWN -> None
[None] dc-3n604e-a-corea:gi7/23 -(UNKNOWN)-> dc-3n620e-b-cr11:gi7/8
    UNKNOWN -> None
[None] dc-3n604e-a-corea:gi7/24 -(UNKNOWN)-> dc-3n620e-b-cr11:gi7/9

Router AKH-1A-A doesn't have anything out of the ordinary either:

Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone,
                  D - Remote, C - CVTA, M - Two-port Mac Relay

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
dc-3n819e-a-ca.dcs.byu.edu
                 Gig 2/0/1         134             R S I  WS-C6509- Gig 2/11
dc-3n819e-a-ca.dcs.byu.edu
                 Gig 1/0/1         174             R S I  WS-C6509- Gig 1/11
akh-300a-a.dcs.byu.edu
                 Gig 2/0/3         174              S I   WS-C3750- Gig 1/0/1
akh-300a-a.dcs.byu.edu
                 Gig 1/0/3         176              S I   WS-C3750- Gig 2/0/1
SEP081FF363692E  Fas 1/0/27        175             H P M  IP Phone  Port 1
SEPC0626B63F6D4  Fas 1/0/28        135             H P M  IP Phone  Port 1
SEP64AE0CF7D2E3  Fas 1/0/2         139             H P M  IP Phone  Port 1
SEP000750036DE6  Fas 1/0/26        146             H P M  IP Phone  Port 1
AKH-114-134-AN1  Fas 1/0/12        5                 S    Xirrus XR Gig1
akh-114-a.dcs.byu.edu
                 Gig 2/0/2         173              S I   WS-C3750- Gig 1/0/1
akh-114-a.dcs.byu.edu
                 Gig 1/0/2         159              S I   WS-C3750- Gig 2/0/1

EDIT: I found that another member, AKH-114-100-AN1, was at the bottom of AKH-114-A's list which would have been printed out next. You don't have to proceed unless my findings are wrong. I'll leave this here for further debugging

So AKH-1A-A seems to be done, as it has been visited. Since it's a child node of 'dc-3n819e-a-ca', I decided to check that out:

dc-3n819e-a-ca#sho cdp nei
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone,
                  D - Remote, C - CVTA, M - Two-port Mac Relay

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
lves-w160e-a.dcs.byu.edu
                 Gig 2/12          158              S I   WS-C3750- Gig 2/0/1
lves-w160e-a.dcs.byu.edu
                 Gig 1/12          172              S I   WS-C3750- Gig 1/0/1
akh-1a-a.dcs.byu.edu
                 Gig 2/11          125              S I   WS-C3750- Gig 2/0/1
akh-1a-a.dcs.byu.edu
                 Gig 1/11          147              S I   WS-C3750- Gig 1/0/1
TRPB-110-A1.dcs.byu.edu
                 Gig 2/18          152              S I   WS-C3560- Gig 0/1
dc-3n620e-b-cr11.dcs.byu.edu
                 Gig 2/1           135             R S I  WS-C6509- Gig 8/37
dc-3n620e-b-cr11.dcs.byu.edu
                 Gig 2/17          167             R S I  WS-C6509- Gig 8/12
dc-3n620e-b-cr11.dcs.byu.edu
                 Gig 1/17          175             R S I  WS-C6509- Gig 7/12
dc-3n620e-b-cr11.dcs.byu.edu
                 Gig 1/1           145             R S I  WS-C6509- Gig 7/37
mb-150-a.dcs.byu.edu
                 Gig 1/5           170              S I   WS-C3750- Gig 1/0/1
mlrp-144-a2.dcs.byu.edu
                 Gig 2/9           131              S I   WS-C3560- Gig 0/1
cone-225-a1.dcs.byu.edu
                 Gig 2/6           179              S I   WS-C3560- Gig 0/1
mlrp-144-a1.dcs.byu.edu
                 Gig 1/9           140              S I   WS-C3560- Gig 0/1
cone-225-a1.dcs.byu.edu
                 Gig 1/6           179              S I   WS-C3560- Gig 0/2
alln-339-a1.dcs.byu.edu
                 Gig 2/8           140              S I   WS-C3560- Gig 0/1
CANC-161-a1.dcs.byu.edu
                 Gig 2/10          130              S I   WS-C3560- Gig 0/4
CANC-161-a1.dcs.byu.edu
                 Gig 1/10          128              S I   WS-C3560- Gig 0/1
BRMB-151-A.dcs.byu.edu
                 Ten 8/7           178              S I   WS-C4506- Ten 1/2
BRMB-151-A.dcs.byu.edu
                 Ten 7/7           131              S I   WS-C4506- Ten 1/1
BRMB-265-A.dcs.byu.edu
                 Ten 8/6           154              S I   WS-C4506- Ten 1/1
BRMB-265-A.dcs.byu.edu
                 Ten 7/6           129              S I   WS-C4506- Ten 1/2
mc-1212-a.dcs.byu.edu
                 Gig 2/14          176              S I   WS-C3750V Gig 2/0/1
mc-1212-a.dcs.byu.edu
                 Gig 1/14          128              S I   WS-C3750V Gig 1/0/1
ppt3-loc3-a.dcs.byu.edu
                 Gig 2/2           129              S I   WS-C3750G Gig 2/0/10
ppt3-loc3-a.dcs.byu.edu
                 Gig 2/16          136              S I   WS-C3750G Gig 2/0/12
ppt3-loc3-a.dcs.byu.edu
                 Gig 1/16          177              S I   WS-C3750G Gig 1/0/12
ppt3-loc3-a.dcs.byu.edu
                 Gig 1/2           177              S I   WS-C3750G Gig 1/0/10
mp-190-a1.dcs.byu.edu
                 Gig 1/19          124              S I   WS-C3560- Gig 0/1
dc-4n442e-a1-RADIO.dcs.byu.edu
                 Gig 1/18          139              S I   WS-C3750X Gig 1/0/1
conf-loc2-b.dcs.byu.edu
                 Gig 2/15          129              S I   WS-C4506- Gig 1/4
conf-loc2-b.dcs.byu.edu
                 Gig 2/3           126              S I   WS-C4506- Gig 1/6
conf-loc2-b.dcs.byu.edu
                 Gig 1/15          139              S I   WS-C4506- Gig 1/3
conf-loc2-b.dcs.byu.edu
                 Gig 1/3           138              S I   WS-C4506- Gig 1/5
dc-3n504e-a-coreb.dcs.byu.edu
                 Ten 8/5           154             R S I  WS-C6509- Ten 8/14
dc-3n604e-a-corea.dcs.byu.edu
                 Ten 7/5           154             R S I  WS-C6509- Ten 8/14
ldsp-106a-a.dcs.byu.edu
                 Gig 2/13          160              S I   WS-C3750- Gig 2/0/1
ldsp-106a-a.dcs.byu.edu
                 Gig 1/13          160              S I   WS-C3750- Gig 1/0/1

So let's sort it out:

I don't know if your code has already entered LVES at this point because it's not printed out ('....LVES-W160E-A ( << IP >>)'). Assuming that it has, here is the CDP report:

lves-w160e-a#sho cdp nei
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone,
                  D - Remote, C - CVTA, M - Two-port Mac Relay

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
dc-3n819e-a-ca.dcs.byu.edu
                 Gig 2/0/1         129             R S I  WS-C6509- Gig 2/12
dc-3n819e-a-ca.dcs.byu.edu
                 Gig 1/0/1         141             R S I  WS-C6509- Gig 1/12
LVES-E110F-A1.dcs.byu.edu
                 Gig 1/0/2         143              S I   WS-C3560- Gig 0/1
lves-l334a-a.dcs.byu.edu
                 Gig 3/0/1         157              S I   WS-C3750- Gig 2/0/1
lves-l334a-a.dcs.byu.edu
                 Gig 2/0/2         152              S I   WS-C3750- Gig 1/0/1
sttw-100-a1.dcs.byu.edu
                 Fas 2/0/48        160              S I   WS-C3560- Gig 0/1
SEP000DBCCC7127  Fas 1/0/1         127             H P M  IP Phone  Port 1
SEP000DBCD9409E  Fas 1/0/2         130             H P M  IP Phone  Port 1
SEPB8621F6D8D5C  Fas 1/0/17        132             H P M  IP Phone  Port 1
lves-w126b-a1.dcs.byu.edu
                 Gig 3/0/4         172              S I   WS-C3560- Gig 0/2
lves-w126b-a1.dcs.byu.edu
                 Gig 3/0/3         172              S I   WS-C3560- Gig 0/1
SEP081FF3638032  Fas 1/0/13        133             H P M  IP Phone  Port 1
lves-n101-a1.dcs.byu.edu
                 Gig 1/0/4         136              S I   WS-C3560- Gig 0/1
lves-n155-a1.dcs.byu.edu
                 Fas 2/0/45        152              S I   WS-C3560- Gig 0/1
SEP000DBCE97083  Fas 1/0/3         135             H P M  IP Phone  Port 1
lves-e194-a1.dcs.byu.edu
                 Gig 1/0/3         172              S I   WS-C3560- Gig 0/1
SEP000DBCCC6C88  Fas 1/0/18        143             H P M  IP Phone  Port 1
lves-n195-a1.dcs.byu.edu
                 Fas 2/0/44        123              S I   WS-C2940- Gig 0/1
lves-s156-a1.dcs.byu.edu
                 Gig 2/0/4         150              S I   WS-C3560- Gig 0/1
lves-w104-a1.dcs.byu.edu
                 Gig 2/0/3         129              S I   WS-C3560- Gig 0/1

At this point I'm not sure. If you need LVES-E110F's CDP report I can add that.

kamakazikamikaze commented 9 years ago

Whoops! Just noticed something. I know I went overboard here, so let's step back near the top. Check AKH-114-A's CDP report. At the bottom of its list it shows:

AKH-114-100-AN1  Fas 1/0/1         9                 S    Xirrus XR Gig1

This isn't in the output. So somehow it's stuck here. If you print output as soon as a device is found then it hasn't tried to enter it yet. If you aren't printing until after you check an SNMP request and look at the CDP table then maybe it's stuck here. I'll edit this when I check what it holds.

EDIT

AKH-114-100-AN1      10.23.69.20     Xirrus XR520, 512MB (300MHz)   Gig1             none  L2SW(switch)   

Let me see if other Xirrus devices include themselves (or nothing) in their tables. Perhaps the device doesn't have CDP enabled. If so, it wouldn't show up in AKH-114-A's table, but our Xirrus devices don't play fair with Cisco's equipment.

EDIT2

CDP is enabled. So I'm a little stumped. I'm running a depth 7 instance to see if it skips over this problem. I'll try a depth of 6 if I do

MJL85 commented 9 years ago

Would you be comfortable changing some of the code? I'd like to see what OID it polls and what the return value is at this point.

In graph.py, around line 151, try adding these print() calls between these two lines:

rip = snmpobj.cache_lookup(cdp_vbtbl, OID_CDP_IPADDR + '.' + ifidx + '.' + t[15])
print('cdp "%s" = "%s"' % (name, val))
print('cache_lookup(cdp, "%s") = "%s"' % ((OID_CDP_IPADDR + '.' + ifidx + '.' + t[15]), rip))
rip = convert_ip_int_str(rip)

Before doing this I just noticed this isn't a Cisco device. Can you do a show cdp n fa1/0/1 de on akh-114-a and see if there's a management IP address advertised?

kamakazikamikaze commented 9 years ago

I'll enter that once our other instances have finished. I understand that the scripts are compiled into a .pyc at runtime but I'm unaware of any potential problems if it tries to compile with the updated code while another .pyc is open and active in the directory.

Also, should/could we make it print out to a log instead of to the screen? I'm trying to run these in the background

MJL85 commented 9 years ago

You might be able to redirect the output from stdout and stderr to a file and fork it to the background?

# ( mnet.py -r 10.10.10.10 &> ./10.10.10.10.log ) &   (not tested)
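
As an alternative to shell redirection, the debug output could be written to a file from inside Python. The following is a minimal sketch using the standard `logging` module; the logger name, the `setup_logger` helper, and the log path are hypothetical and are not part of mnet.py.

```python
import logging
import os
import tempfile

# Hypothetical log location; any writable path would do.
LOG_PATH = os.path.join(tempfile.gettempdir(), "mnet-sketch.log")

def setup_logger(path):
    """Return a logger that appends messages to `path` instead of the screen."""
    logger = logging.getLogger("mnet-sketch")
    logger.setLevel(logging.DEBUG)
    # Drop handlers from earlier calls so lines are not written twice.
    for old in list(logger.handlers):
        logger.removeHandler(old)
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger

log = setup_logger(LOG_PATH)
# Same information as the debug print() calls, but sent to the log file.
log.info('cdp "%s" = "%s"', "1.3.6.1.4.1.9.9.23.1.2.1.1.6.113.7", "node0")
```

With this approach the process can be started in the background without any redirection, and tracebacks can be captured with `log.exception(...)` inside an `except` block.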

kamakazikamikaze commented 9 years ago

I'm using nohup and I figured it out. Very similar to your line. I'll add it in and run it at depth 10 again. I won't be able to report back until tomorrow when I'm back at work

EDIT

It may have actually been a result of our server dropping the connection to the vlan we need to access these devices. I reset my settings and am trying again. Time to get out of this cubicle!

kamakazikamikaze commented 9 years ago

Here's a log of depth 10 from the same starting node.

Same starting point but depth of 8.

Here's the log from an attempt of depth 2 in a different building.

Thus far I have had only one process successfully complete. All others have had this same issue.

MJL85 commented 9 years ago

Alright, looks like it's returning an empty string when you query the IP address OID.

Try changing line 188 of util.py from

if (iip != None):

to

if ((iip != None) | (iip != '')):

kamakazikamikaze commented 9 years ago

I added that half an hour ago and have it running. Question though now that I've re-read it, should it be if ((iip != None) & (iip != '')): since we don't want to proceed if iip = ''?

MJL85 commented 9 years ago

Yes! Sorry about that. Doh

On Jul 17, 2015, at 12:57 PM, Kent Coble notifications@github.com wrote:

I added that half an hour ago and have it running. Question though now that I've re-read it, should it be if ((iip != None) & (iip != '')): since we don't want to proceed if iip = ''?

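
The difference between the two conditions can be checked directly. In the sketch below, `guard_or` and `guard_and` simply wrap the two expressions quoted above, and `hex_to_dotted` is a hypothetical helper (not the project's own conversion function) showing how an empty string triggers the `int()` error from the issue title unless it is filtered out first. Returning 'UNKNOWN' for missing values is an assumption based on the output later in this thread.

```python
def guard_or(iip):
    # First suggestion: "not None OR not empty" holds for every possible value.
    return (iip != None) | (iip != '')

def guard_and(iip):
    # Corrected form: the value must be neither None nor the empty string.
    return (iip != None) & (iip != '')

for value in (None, '', '0x0a030159'):
    print(repr(value), guard_or(value), guard_and(value))

def hex_to_dotted(iip):
    """Hypothetical helper: turn '0x0a030159' into '10.3.1.89'."""
    if iip is None or iip == '':
        # int('', 0) would raise ValueError, so bail out before converting.
        return 'UNKNOWN'
    n = int(iip, 0)
    return '.'.join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))

print(hex_to_dotted('0x0a030159'))
print(hex_to_dotted(''))
```

In new code, `iip is not None and iip != ''` (or simply `if iip:`) would be the more idiomatic spelling, since `and` short-circuits while `&` evaluates both sides.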

kamakazikamikaze commented 9 years ago

I made the edit and ran a shallow instance to test. We've successfully detected when iip = '' but now we're stopped when an UNKNOWN is returned:

cdp "1.3.6.1.4.1.9.9.23.1.2.1.1.6.112.20" = "dc-3n504e-a-coreb.dcs.byu.edu"
cache_lookup(cdp, "1.3.6.1.4.1.9.9.23.1.2.1.1.4.112.20") = "0x0a030159"
cdp "1.3.6.1.4.1.9.9.23.1.2.1.1.6.113.7" = "node0"
cache_lookup(cdp, "1.3.6.1.4.1.9.9.23.1.2.1.1.4.113.7") = ""
Traceback (most recent call last):
  File "mnet.py", line 183, in <module>
    main(sys.argv[1:])
  File "mnet.py", line 61, in main
    graph(argv[1:])
  File "mnet.py", line 119, in graph
    graph.crawl_node(opt_root_ip, opt_depth)
  File "/srv/samba/Share/Students/Kent's crap/workspace/mnet/mnetsuite/graph.py", line 198, in crawl_node
    self.crawl_node(child, depth-1)
  File "/srv/samba/Share/Students/Kent's crap/workspace/mnet/mnetsuite/graph.py", line 157, in crawl_node
    if (self.is_node_allowed(rip) == 0):
  File "/srv/samba/Share/Students/Kent's crap/workspace/mnet/mnetsuite/graph.py", line 207, in is_node_allowed
    ipaddr = IPAddress(ip)
  File "/usr/local/lib/python2.7/dist-packages/netaddr/ip/__init__.py", line 306, in __init__
    'address from %r' % addr)
netaddr.core.AddrFormatError: failed to detect a valid IP address from 'UNKNOWN'

MJL85 commented 9 years ago

Hmm.. I think we should allow these nodes by default. You can add this if-statement to graph.py to get around the error you've run across.

def is_node_allowed(self, ip):
    if (ip == 'UNKNOWN'):
        return 1

    ipaddr = None
    if (USE_NETADDR):

But because of that, we also need to stop an unknown node from being crawled; unknowns can only be leaves on the graph. So you'll need to add this too:

def crawl_node(self, ip, depth):
    if ((self.is_node_allowed(ip) == 0) | (ip == 'UNKNOWN')):
        return
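
Taken together, the two changes mean that an 'UNKNOWN' address may appear on the graph but is never crawled. Here is a standalone sketch of that rule. It uses the standard library `ipaddress` module as a stand-in for netaddr, and the `ALLOWED` network and the `should_crawl` helper are hypothetical names introduced only for illustration.

```python
import ipaddress

# Hypothetical allow-list; the real tool takes its ranges from its own settings.
ALLOWED = ipaddress.ip_network("10.0.0.0/8")

def is_node_allowed(ip):
    """Allow unknown nodes by default, otherwise check the address range."""
    if ip == "UNKNOWN":
        return True
    try:
        return ipaddress.ip_address(ip) in ALLOWED
    except ValueError:
        # Anything that is not a valid IP address is rejected, not raised.
        return False

def should_crawl(ip):
    """Unknown nodes can be drawn as leaves but must never be crawled."""
    return ip != "UNKNOWN" and is_node_allowed(ip)

print(is_node_allowed("UNKNOWN"), should_crawl("UNKNOWN"))
print(is_node_allowed("10.23.69.20"), should_crawl("10.23.69.20"))
```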

Thanks for all the help btw!

kamakazikamikaze commented 9 years ago

Ok, I've applied the changes and I'll start with a shallow depth run for the sake of time. If that passes I'll run it at a depth of 10 and report when it's all finished.

Glad to see I'm actually contributing. Was this script originally intended for smaller/more layered networks? It looks like output works well if there are fewer devices in breadth and more in depth. Perhaps these issues are arising only because of how poorly implemented our network is :stuck_out_tongue_winking_eye:

EDIT

Depth of 2 passed. I will add a new comment when 10 passes. The SVG also looks OK, but I haven't checked, and likely will not check, whether it's completely accurate.

kamakazikamikaze commented 9 years ago

It worked! Took a long time but it finished and exported to an SVG. Check it out!

MJL85 commented 9 years ago

Holy cow, that's one big diagram! Is that a network at a single location or is it spanning multiple sites somehow? Either way, good luck trying to draw something like that and keep it up to date in Visio; that would be a full-time job. Lol.. glad it worked for ya! I see a couple of oddities in it though, some work that could be done here and there.

For instance, some of the nodes aren't showing the correct platform. That's something I found rather frustrating as Cisco seems to put them in different places depending on the platform type, which is the exact information I'm trying to find out. sfh-103-b, sfh-strack-a, conf-loc2-b, and rb-239-a are a few. Also noticed asb-b223-b looped back on itself. Could be a transparent bridge is there?

kamakazikamikaze commented 9 years ago

This is our network starting at the entry router. It is indeed connecting to multiple buildings if that's what you mean by sites. Like I said, there are over 3300 VLANs and tons of routers/firewalls/filters for it all.

Let me help you out:

If you need help determining the SNMP values for the specific platforms I may be able to provide them. I actually was looking it up a few weeks ago to check for the PSUs on C4506s. The C3850s will definitely be different than the others since it acts as a stack, thereby reiterating indices (ex. 1000 is switch 1, 2000 is switch 2, ...)

EDIT

Looking at v0.2 (I haven't updated yet) I see that you're checking for:

    OID_PLATFORM1 = ENTITY-MIB::entPhysicalDescr
    OID_PLATFORM2 = CISCO-ENTITY-ASSET-MIB::ceAssetTag
    OID_PLATFORM3 = CISCO-ENHANCED-IMAGE-MIB::ceImageFamily
    OID_PLATFORM4 = ENTITY-MIB::entPhysicalDescr

I have an old script that retrieves a platform using OID entPhysicalModelName which works just fine for the devices above. Perhaps you should try that instead?

EDIT 2

If you have bash and the Net-SNMP package correctly installed you could try something along the lines of this:

# Get device model; OIDs 1, 1000, or 1001 will have a switch (stack) member

read -a model <<<$(snmpbulkget -v2c -r2 -t1 -c "$COMMUNITY" "$SWITCH" ENTITY-MIB::entPhysicalModelName -Oqv)

printf '%s\n' "$model"

EDIT 3

Using the snmp command above I get these:

~ $ snmpbulkget -v2c -r2 -t1 -c <Community> sfh-103-b ENTITY-MIB::entPhysicalModelName -Oqv
WS-C6509-E

~ $ snmpbulkget -v2c -r2 -t1 -c <Community> sfh-strack-a ENTITY-MIB::entPhysicalModelName -Oqv
WS-C3850-48P
WS-C3850-48P

~ $ snmpbulkget -v2c -r2 -t1 -c <Community> conf-loc2-b ENTITY-MIB::entPhysicalModelName -Oqv
WS-C4506-E

~ $

SNMPBULKGET will get all those empty values afterwards so you'll definitely need to trim them.

A 4-member C3850 stack would return this:

WS-C3850-48P
WS-C3850-48P
WS-C3850-48P
WS-C3850-48P

(Just so you know that the SFH-STRACK-A isn't reporting a duplicate)
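
The trimming step could look roughly like the sketch below. The `parse_models` function is hypothetical, and the sample string only imitates the shape of the output shown above. Whether the trailing empty values arrive as blank lines or as `""` is an assumption, so both are handled. Counting stack members this way also assumes that only the switch members report a non-empty model name, as in the samples above.

```python
def parse_models(raw_output):
    """Return the non-empty model names from `-Oqv` style output."""
    models = []
    for line in raw_output.splitlines():
        # Strip whitespace and any surrounding quotes, then skip empty values.
        value = line.strip().strip('"')
        if value:
            models.append(value)
    return models

# Imitation of a two-member stack followed by empty values.
sample = 'WS-C3850-48P\nWS-C3850-48P\n\n""\n\n'
models = parse_models(sample)
print(models[0])      # platform name
print(len(models))    # number of non-empty entries (stack members here)
```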

MJL85 commented 9 years ago

I started using the same MIB for stack info, and it works great. I also added a step at the very end that checks all the nodes to verify we pulled the serial, IOS, and platform. If not, it uses your MIB to pull them directly. So we prefer CDP, since it's faster and we assume it's reliable, and if we don't get the data there we poll the device directly with SNMP. I just committed the change and it seems to work well.
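
The "prefer CDP, fall back to a direct poll" idea can be sketched as follows. This is not the committed change: `fill_missing`, `fake_poll`, the dictionary layout, and the placeholder serial and IOS values are all hypothetical, and a real poller would issue SNMP requests instead of reading from a lookup table.

```python
def fill_missing(node, poll_direct):
    """Fill fields that CDP left empty by polling the device directly."""
    for field in ("serial", "ios", "platform"):
        if not node.get(field):
            # Only reach for the slower direct poll when CDP gave nothing.
            node[field] = poll_direct(node["name"], field)
    return node

def fake_poll(name, field):
    """Stand-in for a direct SNMP query; returns canned answers."""
    canned = {("sfh-103-b", "platform"): "WS-C6509-E"}
    return canned.get((name, field), "UNKNOWN")

# Placeholder serial and IOS values; the platform was not learned via CDP.
node = {"name": "sfh-103-b", "serial": "placeholder-serial",
        "ios": "placeholder-ios", "platform": None}
print(fill_missing(node, fake_poll))
```

Keeping the direct poll as a final pass means the crawl itself stays fast, and only the nodes with gaps pay for the extra queries.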