NullHypothesis / exitmap

A fast and modular scanner for Tor exit relays. The canonical repository (including issue tracker) is at https://gitlab.torproject.org/tpo/network-health/exitmap
https://www.cs.kau.se/philwint/spoiled_onions/
GNU General Public License v3.0

Remove brittle GeoIP code #6

Closed NullHypothesis closed 9 years ago

NullHypothesis commented 10 years ago

The GeoIP code is slow and comes with licensing weirdness. Instead, MaxMind's API can be used, or Onionoo can be queried with the country option: https://onionoo.torproject.org/protocol.html
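As a rough illustration of the Onionoo route, the sketch below builds a details query restricted to one country and pulls the relay fingerprints out of the reply. It is written for Python 3 and is not the code that ended up in exitmap. The helper names (`details_url`, `fingerprints_in`, `fetch_fingerprints`) are made up for this example, and the choice of the `type`, `running`, and `fields` parameters alongside `country` is an assumption about what a caller would want.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ONIONOO_DETAILS = "https://onionoo.torproject.org/details"


def details_url(country_code):
    """Build a details query for running relays in one country (hypothetical helper)."""
    params = {
        "type": "relay",
        "running": "true",
        "country": country_code.lower(),
        # Assumption: only these two fields are needed by the caller.
        "fields": "fingerprint,country",
    }
    # Keep the comma in the fields list unescaped for readability.
    return ONIONOO_DETAILS + "?" + urlencode(params, safe=",")


def fingerprints_in(document):
    """Return the fingerprints listed in an already parsed details document."""
    return [relay["fingerprint"] for relay in document.get("relays", [])]


def fetch_fingerprints(country_code, timeout=30):
    """Network call: one request per country rather than one per relay."""
    with urlopen(details_url(country_code), timeout=timeout) as reply:
        return fingerprints_in(json.load(reply))
```

Keeping the URL building and the parsing apart from the network call means both can be checked offline against a saved reply.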

radman404 commented 9 years ago

I found it was best to use fingerprints from the hosts you want to locate instead of IP addresses.

    def country_code(fingerprint):
        req = fingerprint
        jsonreturned = urllib2.urlopen("https://onionoo.torproject.org/details?fingerprint=%s" % req)
        response = json.loads(jsonreturned.read())
        value = response['relays'][0]['country'].upper()
        return value

I added this to my ip2loc.py, but I just can't get relayselector.py to pass fingerprints to it the way it did for IP addresses. Let me know if you try it and get it working. I don't think it'll be too hard for someone with more knowledge of the code than myself.

radman404 commented 9 years ago

After trying to get to sleep I realized the problem with this code: it has to send one request to the server for every fingerprint. I will rewrite it to grab all nodes for a country, like you mentioned. How would you want the IP:PORT returned, as a list or a dict?

radman404 commented 9 years ago

I have come up with the following code, which returns a dictionary mapping each IP address to its port. Let me know what you think, and whether this is on the right track.

    def country(code):
        host = {}
        req = code.lower()
        jsonreturned = urllib2.urlopen("https://onionoo.torproject.org/details?country=%s" % req)
        response = json.loads(jsonreturned.read())
        for i in range(len(response['relays'])):
            iplist = response['relays'][int(i)]['or_addresses']
            for e in iplist:
                try:
                    ip, port = e.split(':')
                    host[ip] = port
                except ValueError:
                    # probably ipv6 or not an IP address
                    print "probably ipv6 skipping.."
        return host

Thanks, Kyle.
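The split on ':' in the comment above skips IPv6 entries, because those are written with brackets and contain several colons. One possible way around that, sketched here for Python 3 under the assumption that each `or_addresses` entry looks like `host:port` or `[ipv6]:port`, is to split on the last colon only and to key the result by fingerprint. Both function names are hypothetical and this is not the version that was merged.

```python
def split_or_address(address):
    """Split "host:port" into (host, port); brackets around IPv6 are dropped."""
    # rpartition splits on the LAST colon, so colons inside IPv6 are untouched.
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError("not a host:port pair: %r" % address)
    return host.strip("[]"), int(port)


def relay_addresses(document):
    """Map each relay fingerprint to its list of (host, port) tuples."""
    hosts = {}
    for relay in document.get("relays", []):
        hosts[relay["fingerprint"]] = [
            split_or_address(entry) for entry in relay.get("or_addresses", [])
        ]
    return hosts
```

Keying by fingerprint keeps all addresses of one relay together, whereas a dictionary keyed by IP address can only hold one port per address.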

NullHypothesis commented 9 years ago

Thanks a lot for working on this, Kyle. I'm currently travelling but I will get back to you in a couple of days.

radman404 commented 9 years ago

No problem, enjoy your time away!

radman404 commented 9 years ago

Have you had a chance to look at this issue?

NullHypothesis commented 9 years ago

Yes, I think your code should work. I'm currently refactoring it and I hope to have a patch by tomorrow. Thanks for working on this.

NullHypothesis commented 9 years ago

Commit https://github.com/NullHypothesis/exitmap/commit/4a7bd69bc3992bff1232dfde7628e9fa4f369c78 uses a modified version of your code to fix this issue. What do you think?

radman404 commented 9 years ago

Excellent, Philipp. Glad I could help! Also, using fingerprints makes much more sense: you don't have to deal with IPv6 addresses or my ".split(':')" way of handling IPv4 (proof of my inexperience). Thanks for informing me of the changes.


NullHypothesis commented 9 years ago

OK, merged. Thanks for your help, radman404!