Open abdullahdevrel opened 2 months ago
I have never been entirely happy about the Maxmind EULA situation, but a number of Linux distros ship the database as packages so I figured it would be fine. Basically a "better to ask forgiveness than permission"-type situation.
Your databases seem way larger; "IP to Country Database" is ~38M. That's far too large to include in the GoatCounter binary. The "Geolite countries" is ~3.7M. I don't know why yours is so much larger. People can already use any mmdb database they want with the -geodb flag, but I also want a basic "good enough" database built in.
Thank you for reviewing the request.
> I have never been entirely happy about the Maxmind EULA situation, but a number of Linux distros ship the database as packages so I figured it would be fine. Basically a "better to ask forgiveness than permission"-type situation.
The challenge is that they explicitly have a commercial distribution license for these free databases, so I am not sure what the consequences of this are, to be honest. I am not sure if those Linux distros have their own licensing terms with them that permit the distribution like that.
> Your databases seem way larger; "IP to Country Database" is ~38M.
That is because our database provides full accuracy. The accuracy extends down to the individual IP level, even for a country database. When you download an IP database, compromise happens in two ways: with infrequent updates and range clustering. However, because we are providing full accuracy, the resulting database is larger.
Another idea: since the database can be downloaded directly via a URI, users could fetch it during installation. That would remove the need to package a database inside the binary in the first place, and the same download mechanism could handle database updates.
> People can already use any mmdb database they want with the -geodb flag, but I also want a basic "good enough" database built in.
On a cursory view, it seems like the lookup mechanism is not database agnostic, but I could be wrong. There are structural differences between our database and MaxMind's (https://ipinfo.io/blog/migrating-from-maxmind-to-ipinfo/), mainly geoname_id and a complementary geoname database.
Let me know what you think.
I want GoatCounter to be a "Just Works" binary without external dependencies, so people can easily self-host with a minimum of fuss. Dealing with GeoIP database downloads rather goes against that.
I don't mind providing compatibility with it, but I don't think it will be the default if it's so much larger.
However, if I try to use it, it errors out with:
maxminddb: cannot unmarshal EU into type struct { Names map[string]string "maxminddb:\"names\""; Code string "maxminddb:\"code\""; GeoNameID uint "maxminddb:\"geoname_id\"" }
So I guess the database structure is different.
I don't want to "migrate to" anything, I want to be compatible with both. I don't understand why you don't just provide a "Maxmind-compatible database" as an option.
Going from country = maxmind_data['country']['iso_code']
to country = ipinfo_data['country']
is a silly change and it doesn't really matter all that much which one is used. Maybe one is marginally better, but not at least providing a compatible database is rather lacking in pragmatism.
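For illustration, here is a minimal Go sketch of what "compatible with both" could look like at the lookup site. It assumes the record has already been decoded into a generic map (how the mmdb reader is set up to do that is not shown), and the helper name `field` is hypothetical. It covers the two layouts discussed above: a flat string value such as `"continent": "EU"`, and a nested map such as `"continent": {"code": "EU"}`.

```go
package main

import "fmt"

// field returns rec[key] as a string whether the value is stored flat
// (e.g. "country": "NL") or nested under sub (e.g. "country":
// {"iso_code": "NL"}). It returns "" if the value is missing or has
// some other shape.
func field(rec map[string]any, key, sub string) string {
	switch v := rec[key].(type) {
	case string:
		return v
	case map[string]any:
		s, _ := v[sub].(string)
		return s
	}
	return ""
}

func main() {
	// Illustrative records only; not taken from either database.
	flat := map[string]any{"country": "NL", "continent": "EU"}
	nested := map[string]any{
		"country":   map[string]any{"iso_code": "NL"},
		"continent": map[string]any{"code": "EU"},
	}
	for _, rec := range []map[string]any{flat, nested} {
		fmt.Println(field(rec, "country", "iso_code"), field(rec, "continent", "code"))
	}
}
```

With a helper like this, the caller does not need to know which database was passed to -geodb. The cost is one type switch per field rather than a typed struct.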
Thank you for reviewing. I understand that MaxMind's database is deeply integrated into the project and that adopting ours would require some engineering investment. We tried our best to provide the simplest and best data out there. The ease of use and the quality of the data usually justify that investment.
Due to the unpredictable nature of MaxMind's database structure, you have to wrap every call to get a value in switch/case statements. In our case, if we do not have the data, we simply return an empty string. Making a drop-in MaxMind-compatible database would essentially be a compromise, in my personal opinion, as you have to create a nested version of the database, which will increase its size.
Hi,
I work for IPinfo, but I have been using Goatcounter for my personal projects for several years and have been exploring self-hosting it recently.
I would like to request the integration of the IPinfo IP to Country or IP to Country ASN/ISP database for Goatcounter. I believe that, from a development philosophy standpoint, IPinfo’s free IP database is a good fit for Goatcounter. There are technical benefits as well.
Goatcounter specific benefits
Binary distribution issues and "MaxMind®️'s EULA"
I have not made progress in self-hosting it yet, but I believe the binary includes MaxMind’s country database, which creates a tricky situation. As far as I know, they do not allow redistribution of their databases, even the free ones. They have an EULA that requires users to download their own copy of the database using their own access tokens.
The value proposition of IPinfo's database is that it is simply CC-BY-SA 4.0. You can do whatever you want with it as long as you give attribution. Commercial usage is allowed as well. Librespeed is using our data by packaging it directly in the repo: https://github.com/librespeed/speedtest/issues/641#issuecomment-2254375165
ASN/ISP data
You have mentioned that city-level data is too granular, so maybe you can add the ASN/ISP data from the IP to Country ASN database as an additional data source. The ASN/ISP detection is based on network routing data.
Our country-level data, even though free, is a zero-compromise, fully accurate database. We support daily updates and offer range clustering. It is a pure subset of our IP geolocation database that drops the more granular location information and provides only country-level data.
General Technical benefits
The database has the following features:
Database schema
start_ip
end_ip
country
country_name
continent
continent_name
asn
as_name
as_domain
Documentation: https://ipinfo.io/developers/ip-to-country-asn-database
Samples are available here: https://github.com/ipinfo/sample-database/tree/main/IP%20to%20Country%20ASN
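As a rough sketch of how these columns could be consumed, the Go snippet below models one row of the schema and finds the row covering an address by binary search. It assumes the rows were loaded from a tabular export, are sorted by start_ip, and do not overlap. `Row`, `newRow`, and `lookup` are hypothetical names, and the sample values use documentation-reserved address ranges and ASNs rather than real data.

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
)

// Row mirrors the schema columns listed above (hypothetical type).
type Row struct {
	StartIP, EndIP           netip.Addr
	Country, CountryName     string
	Continent, ContinentName string
	ASN, ASName, ASDomain    string
}

// newRow builds a Row from textual range bounds. It panics on invalid
// addresses, which is acceptable for a sketch but not for real input.
func newRow(start, end, country, asn string) Row {
	return Row{
		StartIP: netip.MustParseAddr(start),
		EndIP:   netip.MustParseAddr(end),
		Country: country,
		ASN:     asn,
	}
}

// lookup returns the row whose inclusive [StartIP, EndIP] range contains ip.
// rows must be sorted by StartIP and must not overlap.
func lookup(rows []Row, ip string) (Row, bool) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return Row{}, false
	}
	// Index of the first row that starts after addr; the candidate precedes it.
	i := sort.Search(len(rows), func(i int) bool {
		return rows[i].StartIP.Compare(addr) > 0
	})
	if i == 0 || rows[i-1].EndIP.Compare(addr) < 0 {
		return Row{}, false
	}
	return rows[i-1], true
}

func main() {
	// Made-up rows for illustration only.
	rows := []Row{
		newRow("192.0.2.0", "192.0.2.255", "NL", "AS64500"),
		newRow("198.51.100.0", "198.51.100.255", "DE", "AS64501"),
	}
	if r, ok := lookup(rows, "198.51.100.7"); ok {
		fmt.Println(r.Country, r.ASN)
	}
}
```

A missing match is reported through the boolean rather than an error, so the caller can fall back to an empty country.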
The database can be downloaded simply by accessing the storage URI with an access token.
My apologies for the wall of text. Let me know what you think. Thank you!