beyond-all-reason / teiserver

Middleware server for online gaming
https://www.beyondallreason.info/
MIT License

Swap `geoiplookup` + manual `.dat` lookup dependency to a proper GeoIP database #138

Open suumpmolk opened 1 year ago

suumpmolk commented 1 year ago

Hi, although this may not be a high-priority issue, I would like to at least start a discussion about it, since I'm one of the players affected.

Problem: When a user signs up, the IP-based country lookup may be incorrect, in which case the wrong country is stored.

I have given this some thought and understand that the problem is non-trivial. One solution would be to update the GeoIP database regularly so that it has up-to-date information, and to provide a way to update existing users' countries.

Another alternative, of course, is to allow users to set their country flag manually.

A suggestion on how to automate the GeoIP database

I looked at various IP databases and MaxMind seems to be the most established. Their free GeoLite2 databases are updated twice weekly and are available both as downloads and through a web service (limited to 1000 lookups per day). To avoid any lookup limits I tried the download option.

Signing up was hassle-free. Got access immediately and could download their GeoIP.conf with license and account id.

With regular geoiplookup:

$ geoiplookup MY-IP
GeoIP Country Edition: KZ, Kazakhstan

Using GeoLite2:

(save GeoIP.conf in /etc/GeoIP.conf)
$ sudo apt install geoipupdate
$ sudo geoipupdate
$ pip install geoip2
$ python3
>>> import geoip2.database
>>> with geoip2.database.Reader('/var/lib/GeoIP/GeoLite2-Country.mmdb') as reader:
...     response = reader.country('MY-IP')
>>> response.country.iso_code
'SE'
>>> response.country.name
'Sweden'

Updating the database can be done with crontab, as suggested by MaxMind:

32 22 * * 1,3 /usr/local/bin/geoipupdate

I used python here but there are multiple supported client APIs.

This only addresses the first part of the issue, and there is a caveat: MaxMind requires attribution for using GeoLite2. I'm not sure whether that is a deal breaker, but it's understandable if it is. There are other providers, but they seem to use the same license.

Any thoughts on this? Any alternative solutions to the problem?

Teifion commented 1 year ago

This looks like a nice solution, I'll have a go at trying it :)

Teifion commented 1 year ago

I've run the update though it's still not perfect (I tested it on a known bad lookup and it still shows incorrectly).

suumpmolk commented 1 year ago

That's disappointing. Hard to tell how much better it is without having a large dataset of known bad lookups.

I noticed Elixir wasn't part of their official APIs. Did you try the unofficial one?

Teifion commented 1 year ago

I've been using the command line calls from Elixir, seemed the easiest method.
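
For readers following along, here is a rough sketch of what "command line calls" to geoiplookup involve. It is written in Python to match the snippet earlier in the thread, not in Elixir, and it is not teiserver's actual code. The function names are hypothetical, and it assumes the output has the same shape as the `GeoIP Country Edition: KZ, Kazakhstan` line shown above.

```python
import re
import subprocess
from typing import Optional, Tuple

# Matches lines such as "GeoIP Country Edition: KZ, Kazakhstan".
_COUNTRY_LINE = re.compile(r"^GeoIP Country Edition: ([A-Z]{2}), (.+)$")


def parse_geoiplookup(output: str) -> Optional[Tuple[str, str]]:
    """Return (iso_code, country_name) from geoiplookup output, or None."""
    for line in output.splitlines():
        match = _COUNTRY_LINE.match(line.strip())
        if match:
            return match.group(1), match.group(2)
    # No country line found, e.g. the address is not in the database.
    return None


def lookup_country(ip: str) -> Optional[Tuple[str, str]]:
    """Run the geoiplookup binary (assumed to be on PATH) and parse its output."""
    result = subprocess.run(
        ["geoiplookup", ip], capture_output=True, text=True, check=False
    )
    return parse_geoiplookup(result.stdout)
```

Keeping the parsing separate from the subprocess call means the parser can be checked against known output strings without the binary or a `.dat` file installed.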

suumpmolk commented 1 year ago

> I've been using the command line calls from Elixir, seemed the easiest method.

GeoLite2 uses a binary database format (MMDB), and reading it requires one of their APIs or their beta command-line tool, mmdbinspect.

The geoiplookup command is not compatible with GeoLite2.

abdullahdevrel commented 5 months ago

Would it be possible to switch to IPinfo's free IP to Country database?

https://ipinfo.io/developers/ip-to-country-database

| Field name | Example | Data type | Description |
|---|---|---|---|
| start_ip | 217.220.0.0 | TEXT | Starting IP address of an IP address range |
| end_ip | 217.223.255.255 | TEXT | Ending IP address of an IP address range |
| country | IT | TEXT | ISO 3166 country code of the location |
| country_name | Italy | TEXT | Name of the country |
| continent | EU | TEXT | Continent code of the country |
| continent_name | Europe | TEXT | Name of the continent |
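
As a rough illustration of how a range table with these fields could be queried, here is a minimal Python sketch. It is not IPinfo's tooling. It assumes the CSV download has a header row using the field names above, handles IPv4 only, and assumes ranges do not overlap. The sample row is built from the example values in the table, and `RangeTable` is a hypothetical name.

```python
import bisect
import csv
import io
import ipaddress
from typing import List, Optional

# One sample row built from the example values listed above (assumed CSV layout).
SAMPLE_CSV = """start_ip,end_ip,country,country_name,continent,continent_name
217.220.0.0,217.223.255.255,IT,Italy,EU,Europe
"""


class RangeTable:
    """Sorted IPv4 ranges supporting binary-search country lookups."""

    def __init__(self, csv_text: str) -> None:
        rows = []
        for row in csv.DictReader(io.StringIO(csv_text)):
            start = ipaddress.ip_address(row["start_ip"])
            end = ipaddress.ip_address(row["end_ip"])
            if start.version != 4:
                continue  # this sketch skips IPv6 ranges
            rows.append((int(start), int(end), row["country"]))
        rows.sort()
        self._starts: List[int] = [r[0] for r in rows]
        self._rows = rows

    def country_for(self, ip: str) -> Optional[str]:
        """Return the ISO country code for ip, or None if no range covers it."""
        addr = ipaddress.ip_address(ip)
        if addr.version != 4:
            return None
        value = int(addr)
        # Find the last range whose start is <= the address.
        i = bisect.bisect_right(self._starts, value) - 1
        if i >= 0 and value <= self._rows[i][1]:
            return self._rows[i][2]
        return None


table = RangeTable(SAMPLE_CSV)
print(table.country_for("217.221.10.5"))
```

Sorting the range starts once and using `bisect` keeps each lookup logarithmic in the number of ranges. For the MMDB download, an MMDB reader library would be used instead.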

I came across a user location accuracy issue and I think our free database would be perfect for the game.

Moreover, for accuracy or correction issues, we have an active support team, and I am active on Reddit and in our community and can talk with users directly as well.

Please let me know what you think. Thank you.

p2004a commented 5 months ago

Hey @abdullahdevrel, thank you for chiming in. I think the answer is: maybe?

I think it's impossible for us to evaluate the claim that it's better accuracy than MaxMind's GeoLite2 we use at the moment, but I would be interested in trying it out, especially if the claim on Reddit

> and our correction process is faster

is true.

Currently teiserver uses the old geoiplookup binary, which requires data in the legacy DAT format. We fetch the data from https://mailfud.org/geoip-legacy/, as many Linux distributions do, e.g. Debian: https://packages.debian.org/bookworm/geoip-database.

So the first hurdle would be to get from the MMDB format to a geoiplookup-compatible DAT file...

abdullahdevrel commented 5 months ago

@p2004a Thank you for reviewing the request.

> I think it's impossible for us to evaluate the claim that it's better accuracy than MaxMind's GeoLite2

We always prefer evidence. For starters, we have this page that compares our data with some competitor data in the industry: https://ipinfo.io/accuracy

Our free IP to Country dataset is technically a "premium" dataset, as it provides full accuracy with daily updates. GeoLite2 tends to make intentional compromises in accuracy and updates twice a week. While our free dataset is essentially just a subset of our geolocation data, it is considered "premium" right out of the box.

Our data accuracy is better because we operate a 700-server strong measurement network infrastructure that uses latency-based IP geolocation, while traditional IP geolocation companies heavily rely on self-reported internet data. You can learn more about this in this article: https://ipinfo.io/blog/probe-network-how-we-make-sure-our-data-is-accurate

From a user's perspective, these are just contexts around accuracy. To verify the accuracy, you have to invest some amount of time in going through the data: https://community.ipinfo.io/t/consensus-does-not-equate-to-accuracy-verify-the-ip-location-yourself/5519

We understand that, which is why we have pitched our user support and community as well.

The correction process is entirely automated. Users can submit corrections here: https://ipinfo.io/corrections. It takes a day or two to validate the report, and then if it passes our checks, we merge it with our database. As the database is updated daily, there should be no issues getting readily updated location corrections.

We have a great support team and a strong community presence. I am available on Twitter and Reddit. We also have a community platform through which our team is accessible to any user issues.

> So the first hurdle would be to get from the MMDB format to a geoiplookup-compatible DAT file...

Unfortunately, we do not offer the legacy DAT format file. We have mmdb, csv, and json formats available. The MMDB file is supported by all MMDB reader libraries.


Thank you again for reviewing the request. Please check out the dataset and let me know what you think or how I can help.

PS: If you are planning to incorporate bot detection or a ban mechanism, I would recommend checking the IP to Country ASN variant of the free database, as the ASN data is quite useful.

p2004a commented 5 months ago

Thank you again @abdullahdevrel!

I think at the moment it's blocked on switching from the DAT format to the proper industry-standard format.

I see there are some Elixir libraries, like https://hexdocs.pm/locus, so we should be able to do it once one of our volunteers picks it up :)