hadiasghari / pyasn

Python IP address to Autonomous System Number lookup module. (Supports fast local lookups, and historical lookups using archived BGP dumps.)

IOError: invalid data stream #23

Closed alvaromlg closed 8 years ago

alvaromlg commented 8 years ago

Hello,

I downloaded a rib file with: pyasn_util_download.py --latest

Then I tried to convert it into a .dat with: pyasn_util_convert.py --single rib.20160411.1400 ipasn_20160411.dat

And then it crashed with:

MRT RIB log importer 1.5.0b7
Traceback (most recent call last):
  File "/home/apelegrina/.virtualenvs/test/bin/pyasn_util_convert.py", line 53, in <module>
    dat = mrtx.parse_mrt_file(f, print_progress=print_progress)
  File "/home/apelegrina/.virtualenvs/test/lib/python2.7/site-packages/pyasn/mrtx.py", line 64, in parse_mrt_file
    mrt = MrtRecord.next_dump_table_record(mrt_file)
  File "/home/apelegrina/.virtualenvs/test/lib/python2.7/site-packages/pyasn/mrtx.py", line 208, in next_dump_table_record
    buf = f.read(header_len)  # read table-header
IOError: invalid data stream

Am I missing something?

Thanks

hadiasghari commented 8 years ago

What is the size of the downloaded file (rib.20160411.1400)? Could you try downloading again?

alvaromlg commented 8 years ago

It's 1.5G

-rw-rw-r--. 1 apelegrina apelegrina 1,5G abr 12 15:50 rib.20160411.1400

I will try again

alvaromlg commented 8 years ago

It crashes again:

MRT RIB log importer 1.5.0b7
Traceback (most recent call last):
  File "/home/apelegrina/.virtualenvs/test/bin/pyasn_util_convert.py", line 53, in <module>
    dat = mrtx.parse_mrt_file(f, print_progress=print_progress)
  File "/home/apelegrina/.virtualenvs/test/lib/python2.7/site-packages/pyasn/mrtx.py", line 64, in parse_mrt_file
    mrt = MrtRecord.next_dump_table_record(mrt_file)
  File "/home/apelegrina/.virtualenvs/test/lib/python2.7/site-packages/pyasn/mrtx.py", line 208, in next_dump_table_record
    buf = f.read(header_len)  # read table-header
IOError: invalid data stream

hadiasghari commented 8 years ago

The size of the MRT/RIB file is very strange. These files are typically under 100 MB. Perhaps RouteViews has changed something. I will have a look.

hadiasghari commented 8 years ago

Hi @alvaromlg. I just downloaded the same file manually (from http://archive.routeviews.org/bgpdata/2016.04/RIBS/), and it is only 98 MB! It also converts without an issue.

Is there perhaps some reason that the RIB file wouldn't download correctly in your environment?

$ ls -lh rib*
-rw-r--r-- 1 root root 98M Apr 22 16:15 rib.20160411.1400.bz2

$ pyasn_util_convert.py --single rib.20160411.1400.bz2 ipasntest.dat
MRT RIB log importer 1.5.0b7
parse_mrt_file(): starting parse for MrtTable(ts:1460383200, type:13, sub-type:1, data-len:918, seq:None, prefix:None)
  MRT record 100000 @11s
  MRT record 200000 @23s
  MRT record 300000 @34s
  MRT record 400000 @46s
  MRT record 500000 @59s
  MRT record 600000 @71s
IPASN database saved (634710 prefixes)

alvaromlg commented 8 years ago

Hmm, yeah, in your log I see the problem: I was uncompressing the .bz2 file before converting it. Now it is working. Thanks for your time.
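
The failure above came from handing the converter a file that had already been decompressed. A quick pre-check can catch this before running the conversion. The following is only a minimal sketch based on the well-known bz2 and gzip magic bytes; the helper name `detect_compression` is hypothetical and is not part of pyasn.

```python
def detect_compression(path):
    """Return 'bz2', 'gzip', or None based on the file's leading magic bytes."""
    with open(path, "rb") as f:
        head = f.read(3)
    # bz2 streams start with the ASCII bytes "BZh".
    if head.startswith(b"BZh"):
        return "bz2"
    # gzip streams start with the two bytes 0x1f 0x8b.
    if head.startswith(b"\x1f\x8b"):
        return "gzip"
    # Anything else is treated as uncompressed (or an unknown format).
    return None
```

If this returns `None` for a RouteViews RIB file, the file has most likely been decompressed after download, which matches the symptom reported here.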

nagarjung commented 7 years ago

Hi @hadiasghari, I have an MRT dump from my own openbgp router, so that I don't have to depend on RouteViews. I am trying to convert the MRT file to an IPASN data file using the utility you provide.

pyasn_util_convert.py --single rib-dump-v2-1800 test.dat
MRT RIB log importer 1.5.0b7
Traceback (most recent call last):
  File "/usr/local/bin/pyasn_util_convert.py", line 53, in <module>
    dat = mrtx.parse_mrt_file(f, print_progress=print_progress)
  File "/usr/local/lib/python2.7/dist-packages/pyasn/mrtx.py", line 64, in parse_mrt_file
    mrt = MrtRecord.next_dump_table_record(mrt_file)
  File "/usr/local/lib/python2.7/dist-packages/pyasn/mrtx.py", line 208, in next_dump_table_record
    buf = f.read(header_len)  # read table-header
IOError: invalid data stream

However, I am able to download and convert the RIB files from RouteViews, which are in .bz2 format (rib.20161014.0400.bz2).

So is it possible to convert the MRT dumps from our own openbgp router to IPASN data files?

Also I am not aware that dumps from routeviews or other sources are always available.

I also tried compressing the MRT dump to bz2 and then converting it to an IPASN data file, but that fails too:

nagarjung@PRINHYLTPHP0848:~/pyasn_demo$ pyasn_util_convert.py --single rib-dump-v2-1800.bz2 test.dat
MRT RIB log importer 1.5.0b7
parse_mrt_file(): starting  parse for MrtTable(ts:1471955445, type:13, sub-type:1, data-len:58, seq:None, prefix:None)
  MRT record 100000 @3s
  Error parsing prefix 'xxx.xxx.xxx.xxx/24'
Traceback (most recent call last):
  File "/usr/local/bin/pyasn_util_convert.py", line 53, in <module>
    dat = mrtx.parse_mrt_file(f, print_progress=print_progress)
  File "/usr/local/lib/python2.7/dist-packages/pyasn/mrtx.py", line 83, in parse_mrt_file
    origin = mrt.as_path.origin_as
  File "/usr/local/lib/python2.7/dist-packages/pyasn/mrtx.py", line 443, in origin_as
    assert self.pathsegs[0].seg_type == self.BgpPathSegment.AS_SEQUENCE
IndexError: list index out of range

Thanks

hadiasghari commented 7 years ago

Hi @nagarjung , perhaps it's best to open a new issue for this question.

In general, pyasn_util_convert currently expects a bz2 (or with the new master, also gzip) file. That's the reason for the first error.
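
For a locally produced, uncompressed MRT dump, one way to satisfy that expectation is to write a bz2 copy first. This is a minimal sketch using only the Python 3 standard library; the helper name `compress_to_bz2` is hypothetical, and it only addresses the compression requirement, not any parsing differences in the dump itself.

```python
import bz2
import shutil


def compress_to_bz2(src_path, dst_path=None):
    """Write a bz2-compressed copy of src_path and return the new path."""
    if dst_path is None:
        dst_path = src_path + ".bz2"
    # Stream the copy so that large dumps are never held in memory at once.
    with open(src_path, "rb") as src, bz2.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path
```

The resulting file can then be passed to `pyasn_util_convert.py --single` in place of the raw dump.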

The second error relates to your specific dump file, and there might be reasons it varies slightly from the RouteViews or RIPE files. The error indicates a problem parsing one record of the MRT dump. We are adding code to master that skips records with parsing errors and continues with the rest of the file. That could be one solution, but the risk is that you would lose the information from those records.
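
The skip-and-continue idea can be illustrated generically. The sketch below is not pyasn's code: `parse_all` and `parse_one` are hypothetical names, and the caught exception types are an assumption chosen to match the `IndexError` seen in the traceback above.

```python
def parse_all(records, parse_one):
    """Parse every record, collecting failures instead of stopping at the first."""
    parsed, skipped = [], []
    for index, record in enumerate(records):
        try:
            parsed.append(parse_one(record))
        except (IndexError, ValueError, AssertionError) as exc:
            # Remember which record failed so the data loss can be reviewed later.
            skipped.append((index, exc))
    return parsed, skipped
```

Keeping the list of skipped records makes the trade-off visible: the conversion finishes, and the caller can still see how many prefixes were dropped and why.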

Finally, what do you mean by "I am not aware that dumps from routeviews or other sources are always available"?

-Hadi

nagarjung commented 7 years ago

Hi @hadiasghari , thanks for the reply.

In one of our internal projects, we are collecting BGP dumps from our own BGP router in order to analyse bad bot traffic, among other things.

"Finally, what do you mean by "I am not aware that dumps from routeviews or other sources are always available"?"

Also, is there a way to get information about blacklisted IP prefixes and ASNs from these dumps or from any other sources?

Thanks

hadiasghari commented 7 years ago

Yes, RouteViews and RIPE are global, and contain complete v4 and v6 routing information. They are also historical: you can see the view at different points in the past. We have used them in many measurement projects. They do not, however, include blacklist information and the like.