hadiasghari / pyasn

Python IP address to Autonomous System Number lookup module. (Supports fast local lookups, and historical lookups using archived BGP dumps.)

AssertionError while converting RIB #39

Closed agrepravin closed 7 years ago

agrepravin commented 7 years ago

Got AssertionError while converting rib.20161125.0600.bz2


Traceback (most recent call last):
  File "/usr/bin/pyasn_util_convert.py", line 5, in <module>
    pkg_resources.run_script('pyasn==1.5.0b7', 'pyasn_util_convert.py')
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 461, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 1194, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/lib64/python2.6/site-packages/pyasn-1.5.0b7-py2.6-linux-x86_64.egg/EGG-INFO/scripts/pyasn_util_convert.py", line 75, in <module>
    dat = mrtx.parse_mrt_file(f, print_progress=print_progress)
  File "/usr/lib64/python2.6/site-packages/pyasn-1.5.0b7-py2.6-linux-x86_64.egg/pyasn/mrtx.py", line 96, in parse_mrt_file
    assert mrt.type == mrt.TYPE_TABLE_DUMP
AssertionError
hadiasghari commented 7 years ago

Hi, can you make sure the file has been downloaded correctly? What is the file size? In rare cases there is an issue with a RIB file, and using another file from the same day solves the problem.
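
One way to run the download check suggested above is to decompress the archive to the end and see whether that succeeds. The following is a minimal sketch, not part of pyasn: the helper name `check_bz2_file` is hypothetical, and it assumes Python 3.3+ for `bz2.open`. It only tests the bz2 container, not whether the MRT records inside are valid.

```python
import bz2
import os


def check_bz2_file(path, chunk_size=1024 * 1024):
    """Return (size_in_bytes, ok); ok is True if the whole file decompresses."""
    size = os.path.getsize(path)
    try:
        with bz2.open(path, "rb") as f:
            # Read to the end; a truncated or corrupt download raises here.
            while f.read(chunk_size):
                pass
    except (OSError, EOFError, ValueError):
        return size, False
    return size, True


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, *check_bz2_file(name))
```

A file that passes this check can still trigger the assertion reported here, since the archive itself may be intact while its contents are unexpected.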

Nixtren commented 7 years ago

Got same issue with http://archive.routeviews.org/route-views4/bgpdata/2016.11/RIBS/rib.20161130.1800.bz2

$ md5sum rib.20161130.1800.bz2
295c96254eb48b6e18ac91f0679ce722  rib.20161130.1800.bz2
$ pyasn_util_convert.py --single rib.20161130.1800.bz2 rib.dat
  File "/usr/local/bin/pyasn_util_convert.py", line 53, in <module>
    dat = mrtx.parse_mrt_file(f, print_progress=print_progress)
  File "/usr/local/lib/python2.7/dist-packages/pyasn/mrtx.py", line 89, in parse_mrt_file
    assert mrt.type == mrt.TYPE_TABLE_DUMP
AssertionError

I get the same result with other RIBs from the same month (http://archive.routeviews.org/route-views4/bgpdata/2016.11/RIBS/). The following file from September 2016 converts without problems: http://archive.routeviews.org/route-views4/bgpdata/2016.09/RIBS/rib.20160901.0000.bz2

hadiasghari commented 7 years ago

Thank you for reporting this @Nixtren . I can also replicate the error, and I'll have a look at it to check for the cause. As a temporary workaround: if you don't need IPv6 routes, use the route-views2 RIBs (http://archive.routeviews.org/bgpdata/2016.11/RIBS/), which convert fine.

janrueth commented 7 years ago

Hi, I can also confirm the issue. There is a comment near the assertion

# in TD2, no prefix appears twice. (probably because we use *only entry 0 of records* -- is this ok?)
# in TD1, they do, "but we are only interested in getting the first match" (quote from asn v1.2)
#          for one TD1 dump checked: all duplicate prefixes had same origin (we don't assert all for speed)

However, it appears that there are a lot of duplicates. I checked some of them, and their underlying tables differ: the first one has a lot more peers and the second one has fewer. Could it be that this AS is in some kind of transition that has not finished?

I suggest not failing the program, but rather just emitting a warning when the fields captured by pyasn are the same, and failing only if they differ.

Adding this to pyasn and evaluating rib.20161227.0600.bz2 shows that there are actually 4986 duplicates with the same prefix-to-AS mapping and 3 duplicates where the mapping changes (i.e., errors).

I changed mrtx.py near line 90 from:

        else:
            assert mrt.type == mrt.TYPE_TABLE_DUMP

to:

        else:
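            if mrt.type == mrt.TYPE_TABLE_DUMP_V2:
                # This prefix already exists: compare the stored origin with the new one.
                # (Needs `from sys import stderr`; on Python 2 also
                # `from __future__ import print_function` at the top of the module.)
                previous = results[mrt.prefix]
                current = mrt.as_path.origin_as
                level = "Error" if previous != current else "Warning"
                print("  {}, prefix {} is duplicated, previous: {}, new: {}".format(
                    level, mrt.prefix, previous, current), file=stderr)
                # Fail only when the duplicate maps the prefix to a different AS.
                assert previous == current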
            else:
                assert mrt.type == mrt.TYPE_TABLE_DUMP

I'm not sure why this happens... but I suspect that there is actually something wrong with the generation of the rib file.
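
The warn-or-fail idea described above can also be sketched independently of pyasn's parser. The function below is a hypothetical illustration, not code from `mrtx.py`: it takes plain `(prefix, origin_as)` pairs, keeps the first origin seen for each prefix, and counts harmless duplicates separately from conflicting ones. Unlike the patch above, it raises on a conflict only when `strict=True`.

```python
from __future__ import print_function

import sys


def merge_prefix_origins(records, strict=False, out=sys.stderr):
    """Build a prefix -> origin AS map from (prefix, origin_as) pairs.

    Returns (results, same, conflicting), where `same` counts duplicates
    that repeat the stored origin and `conflicting` counts duplicates
    that map the prefix to a different origin.
    """
    results = {}
    same = 0
    conflicting = 0
    for prefix, origin in records:
        if prefix not in results:
            results[prefix] = origin  # first match wins
            continue
        previous = results[prefix]
        if previous == origin:
            same += 1
            level = "Warning"
        else:
            conflicting += 1
            level = "Error"
        print("  {0}, prefix {1} is duplicated, previous: {2}, new: {3}".format(
            level, prefix, previous, origin), file=out)
        if strict and previous != origin:
            raise ValueError("conflicting origin for prefix {0}".format(prefix))
    return results, same, conflicting
```

Counting instead of asserting makes it possible to report totals such as the 4986 and 3 quoted above for a whole file before deciding whether to reject it.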

hadiasghari commented 7 years ago

I plan to look into this after the new year, thanks for sending more details.

hadiasghari commented 7 years ago

Guys, I have updated master to version 1.5.0b8, which should solve the problem identified here.

Please test and report back to me if there are any problems still. I will close the issue in a few days.

Also happy 2017 :)

PS: @Eichhoernchen indeed you were correct, thanks.

janrueth commented 7 years ago

I'm still a bit confused. I re-added the part of the code that traverses all peer entries of a table entry. Some entries have inconclusive data, and some routers seem to advertise false prefixes; at least some prefixes had multiple ASes pointing to them. Wouldn't it be better to return a set of ASes in these cases?

hadiasghari commented 7 years ago

@Eichhoernchen this is a great question, and possibly relates to issue #37. (If you agree, let's move the discussion there).

I don't have a clear answer for your question yet. Ideally we wouldn't want to return all the ASes pointing to the same prefix, but the best one that the BGP router matches (if such a thing makes sense).

If you use the latest code from master, there is a new '--dump' switch to 'pyasn_util_convert.py' to help with debugging this scenario. I've provided a sample output below.

$ pyasn_util_convert.py --dump rib.20170102.1400.bz2 --limit 5
Dumping MRT/RIB archive to screen:

Record #000001: MrtTD2Record (PEER-INDEX-TABLE, collector 2162111334, 68 peers)

Record #000002: MrtTD2Record (IPV4-UNICAST 0.0.0.0/0, 6 entries)
     Entry 01 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[34224, 3257]
              BGPAttribute(NEXT_HOP): 1587346450
              BGPAttribute(MULTI_EXIT_DISC): 0
              BGPAttribute(COMMUNITIES): 2242904397
     Entry 02 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[18106]
              BGPAttribute(NEXT_HOP): 3393792045
     Entry 03 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[20771, 3356]
              BGPAttribute(NEXT_HOP): 1358016543
     Entry 04 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[31019, 39326]
              BGPAttribute(NEXT_HOP): 1541707521
     Entry 05 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[58511]
              BGPAttribute(NEXT_HOP): 1744241453
              BGPAttribute(MULTI_EXIT_DISC): 0
     Entry 06 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[47872, 3356]
              BGPAttribute(NEXT_HOP): 3106698241
              BGPAttribute(COMMUNITIES): 20 bytes
     => pyasn choice: AS 3257

Record #000003: MrtTD2Record (IPV4-UNICAST 1.0.0.0/24, 1 entries)
     Entry 01 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[20771, 47872]
              BGPAttribute(NEXT_HOP): 1358016543
     => pyasn choice: AS 47872

Record #000004: MrtTD2Record (IPV4-UNICAST 1.0.4.0/24, 42 entries)
     Entry 01 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[8492, 20764, 3216, 4637, 1221, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 56203]
              BGPAttribute(NEXT_HOP): 1433534681
              BGPAttribute(COMMUNITIES): 40 bytes
     Entry 02 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[1239, 4637, 4637, 4637, 4637, 1221, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 56203]
              BGPAttribute(NEXT_HOP): 2430923138
              BGPAttribute(MULTI_EXIT_DISC): 225
     Entry 03 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[6762, 4637, 1221, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 56203]
              BGPAttribute(NEXT_HOP): 3273054396
              BGPAttribute(MULTI_EXIT_DISC): 100
              BGPAttribute(COMMUNITIES): 443154463
     Entry 04 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[53364, 3257, 4637, 1221, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 56203]
              BGPAttribute(NEXT_HOP): 2915908074
              BGPAttribute(COMMUNITIES): 20 bytes
     Entry 05 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[34224, 5580, 4637, 1221, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 56203]
              BGPAttribute(NEXT_HOP): 1587346450
              BGPAttribute(MULTI_EXIT_DISC): 0
              BGPAttribute(COMMUNITIES): 12 bytes
     Entry 06 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[1668, 4637, 1221, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 56203]
              BGPAttribute(NEXT_HOP): 1119453185
              BGPAttribute(MULTI_EXIT_DISC): 162
     Entry 07 BGPAttribute(ORIGIN): 0
              BGPAttribute(AS_PATH): path-sequence[7018, 209, 4637, 1221, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 38803, 56203]
              BGPAttribute(NEXT_HOP): 201326911
              BGPAttribute(COMMUNITIES): 8 bytes
<CUT>
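
In the two complete records shown in the dump above, the "pyasn choice" equals the last AS in the AS_PATH of entry 01. The difference between that choice and the set of all origins asked about earlier can be sketched as follows. This is an illustration only: the helper names are hypothetical, the paths are copied from record #000002, and the sketch assumes every AS_PATH is a plain sequence (no AS_SET segments).

```python
def origin_as(as_path):
    """Origin AS of a plain AS path: the last AS in the sequence."""
    return as_path[-1]


def summarize_origins(as_paths):
    """Return (origin of the first entry, set of origins over all entries)."""
    origins = [origin_as(path) for path in as_paths if path]
    return origins[0], set(origins)


# AS paths of the six entries for 0.0.0.0/0 in the dump above.
default_route_paths = [
    [34224, 3257],
    [18106],
    [20771, 3356],
    [31019, 39326],
    [58511],
    [47872, 3356],
]

first, all_origins = summarize_origins(default_route_paths)
print(first, sorted(all_origins))
# prints: 3257 [3257, 3356, 18106, 39326, 58511]
```

For this record the six entries yield five distinct origins, which is the situation where returning a single AS and returning a set give different answers.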
janrueth commented 7 years ago

Ah great, I created a similar (though less clean-looking) output myself. Okay, let's move the discussion there.

hadiasghari commented 7 years ago

done, closing this, let's continue there

askabelin commented 7 years ago

Hi. Could you please make a release to PyPI?

hadiasghari commented 7 years ago

@askabelin done!