mteodoro / mmutils

Tools for working with MaxMind GeoIP csv and dat files
MIT License

How can I specify the encoding? #11

Open Morriaty-The-Murderer opened 8 years ago

Morriaty-The-Murderer commented 8 years ago

Hi

My CSV file contains characters (ISP names) outside the ASCII range. After converting it, my application can read the dat file, but those names come out as unreadable characters.

Thank you for your help.

mnicky commented 6 years ago

It works for me with UTF-8:

$ file -i GeoLiteCityv6.tst.csv
GeoLiteCityv6.tst.csv: text/plain; charset=utf-8

$ head GeoLiteCityv6.tst.csv
2001:5::,2001:5:ffff:ffff:ffff:ffff:ffff:ffff,42540488558116655331872044393019998208,42540488637344817846136381986563948543,EU,,Île-de-France,47.0000,8.0000,,0,0

$ ./csv2dat.py -w test.dat -l GeoIPCity-134euw-Location.csv mmcity6 GeoLiteCityv6.tst.csv

$ python
>>> import GeoIP
>>> r=GeoIP.open('test.dat', 0)
>>> r.record_by_addr_v6('2001:5::')
{'city': '\xc3\x8ele-de-France', 'region_name': None, 'region': None, 'area_code': 0, 'time_zone': None, 'longitude': 8.0, 'metro_code': 0, 'country_code3': 'EU', 'latitude': 47.0, 'postal_code': None, 'dma_code': 0, 'country_code': 'EU', 'country_name': 'Europe'}
>>> print r.record_by_addr_v6('2001:5::')['city'].decode('utf-8')
Île-de-France
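
For anyone hitting the "unreadable characters" symptom, here is a minimal Python 3 sketch of what is probably going on, assuming the dat file stores the name bytes exactly as they appeared in the CSV. The byte string is the city value from the record above; the Latin-1 decode is only an assumed example of a wrong guess made by the reading application, not something confirmed in this thread.

```python
# Raw bytes of the city name, copied from the record printed above.
raw = b"\xc3\x8ele-de-France"

# Decoding with the encoding the CSV was written in recovers the text.
good = raw.decode("utf-8")

# Decoding the same bytes with a single-byte codec (Latin-1 here, as an
# assumed example of a wrong guess) yields one character per byte, so the
# two-byte UTF-8 sequence for the accented capital I shows up as two
# unrelated characters.
bad = raw.decode("latin-1")

print(good)
print(repr(bad))

# The damage is reversible as long as the bytes themselves are intact:
# re-encode with the wrong codec, then decode with the right one.
repaired = bad.encode("latin-1").decode("utf-8")
print(repaired == good)
```

So if the lookup returns bytes, decoding them on the reading side with the CSV's encoding should be enough. Whether csv2dat.py has an option to transcode the input is not shown in this thread.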