mteodoro / mmutils

Tools for working with MaxMind GeoIP csv and dat files
MIT License

csv2dat.py lhs tuple error - how to debug where the fault is? #6

Closed (jhaar closed this issue 8 years ago)

jhaar commented 8 years ago

Hi there

I want to use csv2dat.py to merge into GeoIP geographic locations for our 10/8 network (200 sites, I've got lat/long/city by subnet)

So I duplicated the CSV format of GeoLiteCity-Location.csv/GeoLiteCity-Blocks.csv and appended my data, but csv2dat crashes as follows when I run it. BTW, if I run it on the original GeoIP csv files, it works fine, and I even made a 10-line version where I hand-picked some data points from GeoIP plus mine and that worked too. I suspect there's a simple bad char or something in my much larger data set, but it's too big to check by eyeball. How could I add some debugging code to at least tell me which line in locations/blocks is causing the fault?

Here's the runtime error. Thanks!

csv2geoipDAT.py -w mmcity.dat -l locations.csv mmcity blocks.csv
Traceback (most recent call last):
  File "/usr/local/bin/csv2geoipDAT.py", line 475, in <module>
    rval = main()
  File "/usr/local/bin/csv2geoipDAT.py", line 471, in main
    return cmd(opts, args)
  File "/usr/local/bin/csv2geoipDAT.py", line 439, in build_dat
    r.load(opts, args)
  File "/usr/local/bin/csv2geoipDAT.py", line 196, in load
    self[net] = data
  File "/usr/local/bin/csv2geoipDAT.py", line 170, in __setitem__
    if not node.lhs:
AttributeError: 'tuple' object has no attribute 'lhs'

mteodoro commented 8 years ago

I'm away from a computer but I think you might be trying to insert two overlapping IP address ranges. You can use the -d flag to debug, and I've had good luck with the bisect method: split your input file in half, and keep doing that with the half that fails until you have something small enough to eyeball. If it's not sensitive you can post the failing chunk here too and I'll take a look.
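For anyone who wants to automate the halving, here is a minimal sketch of the bisect idea in Python. It is not part of mmutils: `bisect_failing` and the `fails` callback are hypothetical names, and `fails` is assumed to be something you supply that returns True when a subset of rows still triggers the crash (for example by writing the rows to a temporary blocks file and running csv2dat.py on it).

```python
from typing import Callable, List, Sequence


def bisect_failing(rows: Sequence[str],
                   fails: Callable[[Sequence[str]], bool]) -> List[str]:
    """Repeatedly keep whichever half of `rows` still fails.

    `fails` is a caller-supplied check (hypothetical here); it should
    return True when the given subset reproduces the error.
    """
    rows = list(rows)
    while len(rows) > 1:
        mid = len(rows) // 2
        first, second = rows[:mid], rows[mid:]
        if fails(first):
            rows = first
        elif fails(second):
            rows = second
        else:
            # Neither half fails alone, so the fault needs rows from
            # both halves (e.g. two ranges that overlap each other).
            break
    return rows


if __name__ == "__main__":
    # Toy stand-in for "csv2dat crashed": a subset fails if it holds "BAD".
    sample = ["a", "b", "BAD", "c", "d"]
    print(bisect_failing(sample, lambda subset: "BAD" in subset))
```

If the loop stops with more than one row left, the remaining chunk is the smallest one this simple halving can isolate, which is usually small enough to eyeball.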

jhaar commented 8 years ago

Thanks for that - I did just as you said and eventually found some external IPs in there. I had assumed (always a bad thing) that all the addresses were in 10/8 - but we also had details in there about one "real" Internet range that we own - and that caused the problem.
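A check like the one below could have flagged those rows up front. It is only a sketch and not part of mmutils: `check_blocks` is a hypothetical helper, and it assumes the blocks file follows the GeoLiteCity-Blocks.csv layout of `startIpNum,endIpNum,locId` with the IPs stored as integers. Lines whose first two fields are not numeric (copyright notice, header) are skipped. Run it on your own rows before appending them to the GeoIP data, since every public GeoIP range would otherwise be reported as outside 10/8.

```python
import csv
import ipaddress


def check_blocks(lines, allowed="10.0.0.0/8"):
    """Return (line_number, message) pairs for suspicious block rows.

    Assumes each data row is startIpNum,endIpNum,locId with integer IPs.
    """
    net = ipaddress.ip_network(allowed)
    lo, hi = int(net.network_address), int(net.broadcast_address)
    ranges, problems = [], []

    for line_no, row in enumerate(csv.reader(lines), start=1):
        # Skip the copyright line, header, and anything non-numeric.
        if len(row) < 2 or not (row[0].strip().isdigit()
                                and row[1].strip().isdigit()):
            continue
        start, end = int(row[0]), int(row[1])
        if start < lo or end > hi:
            problems.append((line_no, "outside %s: %s-%s" % (
                allowed,
                ipaddress.ip_address(start),
                ipaddress.ip_address(end))))
        ranges.append((start, end, line_no))

    # Sort by start address; a range overlaps if it begins at or before
    # the highest end address seen so far.
    ranges.sort()
    prev_end, prev_line = -1, None
    for start, end, line_no in ranges:
        if start <= prev_end:
            problems.append((line_no, "overlaps line %d" % prev_line))
        if end > prev_end:
            prev_end, prev_line = end, line_no
    return problems


if __name__ == "__main__":
    with open("blocks.csv", newline="") as fh:
        for line_no, message in check_blocks(fh):
            print("line %d: %s" % (line_no, message))
```

The overlap pass is there because overlapping ranges were the first guess in this thread; whether that is exactly what raises the `lhs` error is not confirmed here.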

Thanks again!