hadiasghari / pyasn

Python IP address to Autonomous System Number lookup module. (Supports fast local lookups, and historical lookups using archived BGP dumps.)

Modify mrtx to parse entire MRT file for a single record error #28

Closed by sanketsharma411 7 years ago

sanketsharma411 commented 7 years ago

Right now, in mrtx.parse_mrt_file(..), if a single record has an error, the whole file gets skipped. It would help to modify the function so that an error in a single record does not cause the rest of the records to be skipped.

The issue can be reproduced as follows:

$ curl -O http://data.ris.ripe.net/rrc06/2014.01/bview.20140112.1600.gz
$ gunzip bview.20140112.1600.gz
$ bzip2 bview.20140112.1600
$ pyasn_util_convert.py --single bview.20140112.1600.bz2 bview.20140112.1600.db

hadiasghari commented 7 years ago

@sanketsharma411 that is a valid point. I've been thinking about how to do better error handling and debugging in that function.

My main concern was not to sweep major problems under the rug (e.g. if the downloaded file is corrupt, or if the file format has changed), and also to avoid cascading errors. Perhaps a threshold could be set on how many records to ignore before quitting.
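
A minimal sketch of the threshold idea, not pyasn's actual code. The names `parse_with_error_budget`, `parse_one`, and `max_errors` are hypothetical, and `ValueError` stands in for whatever a real record parser would raise:

```python
def parse_with_error_budget(records, parse_one, max_errors=10):
    """Parse every record, tolerating at most max_errors failures.

    Hypothetical helper: `parse_one` is any callable that parses one raw
    record and raises ValueError when the record is malformed.
    """
    parsed = []
    errors = []  # (record index, exception) for each skipped record
    for index, raw in enumerate(records):
        try:
            parsed.append(parse_one(raw))
        except ValueError as exc:
            errors.append((index, exc))
            # Too many bad records suggests a corrupt file or a format
            # change, so stop instead of silently ignoring everything.
            if len(errors) > max_errors:
                raise RuntimeError(
                    "aborting: %d records failed to parse" % len(errors)
                ) from exc
    return parsed, errors
```

With a budget like this, a handful of bad records is reported back to the caller, while a file that fails on most records still aborts loudly.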

sanketsharma411 commented 7 years ago

@hadiasghari yes, that makes sense. But I am not entirely sure how we would identify a corrupt file or a changed file format. Maybe we can add a boolean argument, as in mrtx.parse_mrt_file(..., skip_record_on_error = False), that allows users to skip records that have problems. We would make it clear that with this argument enabled, pyasn will try its best to parse the file and simply ignore errors.
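
A rough sketch of how such a flag might behave, assuming a hypothetical per-record parser. `parse_all_records` and `parse_one` are illustrative names and not part of mrtx. Only the argument name `skip_record_on_error` comes from the proposal above:

```python
def parse_all_records(records, parse_one, skip_record_on_error=False):
    """Parse records one by one (illustrative only, not the mrtx code).

    With skip_record_on_error=False (the default) the first failure
    propagates to the caller. With True, failing records are counted
    and skipped, and parsing continues with the next record.
    """
    results = []
    skipped = 0
    for raw in records:
        try:
            results.append(parse_one(raw))
        except Exception:
            if not skip_record_on_error:
                raise  # default: keep the strict, fail-fast behaviour
            skipped += 1
    return results, skipped
```

Keeping the default at False preserves the strict behaviour, so a corrupt file still fails unless the user explicitly opts in to best-effort parsing.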

hadiasghari commented 7 years ago

@sanketsharma411, could you open a pull request again? Thanks.

sanketsharma411 commented 7 years ago

@hadiasghari #38

hadiasghari commented 7 years ago

Merged.