censortracker / censortracker_backend

The simple backend for Censor Tracker
MIT License

Don't store user IP at all #9

Open komachi opened 4 years ago

komachi commented 4 years ago

There is no need to store the user IP at all, even temporarily. You could use https://github.com/hadiasghari/pyasn to convert the IP to an AS number offline and on the fly, and store only the AS number in the db. It has the same benefits as your current solution. As I understand it, you save the full client IP address to the db, and every 3 hours convert it to a client hash, region and ISP info by HTTP (this for sure should be fixed) request to a separate proprietary SaaS service; correct me if I'm wrong.

pyasn includes scripts to download raw data from http://archive.routeviews.org. An alternative could be parsing http://thyme.apnic.net data. So store the ASN in the db. To get info about an ASN, pyasn includes a script to parse http://www.cidr-report.org/as2.0/autnums.html. There is also MaxMind's db https://dev.maxmind.com/geoip/geoip2/geolite2-asn-csv-database/ and the relevant Python lib. You could also use whois (or even RDAP with https://rdap.arin.net/registry/autnum/<ASN>).

Anyway, having the ASN is more useful for tracking censorship events, and this conversion could be done entirely offline and on the fly, which protects user privacy better.
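
A rough sketch of the idea, not the backend's actual code: resolve the client IP to an ASN in memory when the request arrives and persist only the ASN. To stay self-contained it uses the standard library and a tiny hypothetical prefix table (documentation prefixes and private-use ASNs); in a real setup the table would come from RouteViews data and a library such as pyasn, whose `lookup` method returns an `(asn, prefix)` pair, would take the place of `ip_to_asn`. The names `PREFIX_TABLE`, `build_index`, `ip_to_asn` and `make_record` are made up for illustration.

```python
import ipaddress

# Hypothetical prefix-to-ASN rows, for illustration only.
PREFIX_TABLE = [
    ("192.0.2.0/24", 64500),
    ("203.0.113.0/24", 64501),
    ("203.0.113.128/25", 64502),
]


def build_index(rows):
    """Parse prefixes and sort them longest-first, so the first match wins."""
    parsed = [(ipaddress.ip_network(prefix), asn) for prefix, asn in rows]
    return sorted(parsed, key=lambda item: item[0].prefixlen, reverse=True)


def ip_to_asn(ip, index):
    """Return the ASN of the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    for network, asn in index:
        if addr.version == network.version and addr in network:
            return asn
    return None


def make_record(client_ip, payload, index):
    """Build the row to store: the ASN is kept, the IP itself is not."""
    return {"asn": ip_to_asn(client_ip, index), **payload}
```

A linear scan is fine for a sketch; a full routing table has several hundred thousand prefixes, which is why a radix-tree based lookup like the one in pyasn would be the practical choice.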

msva commented 3 years ago

Unfortunately, having only the ASN is not enough for duplicate detection.