Closed abdullahdevrel closed 9 months ago
Hi there @abdullahdevrel thanks for following up!
Trippy currently reads `City` data from `mmdb` files and so I think users would need to have the (premium?) IP Geolocation Extended to get that?
I tried downloading the sample in `mmdb` format but I was not able to get `City` data from it using the `maxminddb` crate, perhaps it does not support the ipinfo flavour of `mmdb` file?
Test code:
```rust
use std::net::{IpAddr, Ipv4Addr};

use maxminddb::geoip2::City;

fn main() {
    // Open the IPinfo sample database.
    let reader = maxminddb::Reader::open_readfile("ip_geolocation_extended_ipv4_sample.mmdb").unwrap();
    let addr = IpAddr::V4(Ipv4Addr::from([50, 220, 147, 113]));
    // Attempt to decode the record using MaxMind's nested `City` struct.
    let city_data = reader.lookup::<City<'_>>(addr);
    println!("{city_data:?}");
}
```
Fails with Err(DecodingError("invalid type: string \"Royal Oak\", expected struct City"))
(I tried decoding as `Country` as well)
Perhaps this is what you mean by the data being "flat"? Perhaps I have to deserialise to a custom struct with the "flat" structure? Is there a recommended `mmdb` reader crate for Rust that supports the ipinfo flavour of `mmdb` files?
> You can package our free IP to the Country ASN database with the project
I'd prefer to allow users to bring their own files rather than bundle it, to keep size down and also to prevent stale data being used.
> For that, we will provide an access token that you can use
I'm not quite sure what this is for, presumably the token is used for looking up the ipinfo API? I see you have an API for that. Would the token be something that could be bundled in Trippy for all users or just for development use? I prefer user-provided `mmdb` files over API access as Trippy will often be used in data centre environments with no external internet access.
> By using the IPinfo dataset, you can get both country-level geolocation information and ASN information from a single source.
Just to note that Trippy currently gets ASN data from the IP to ASN Mapping Service provided by Team Cymru via DNS TXT records, so it's mostly the GeoIp data (country, city, lat/long) that is needed.
With some trial and error I was able to figure out it is a `HashMap<String, String>` (I'm sure this is mentioned in your docs somewhere?):
reader.lookup::<HashMap<String, String>>(addr)
Doing that works:
Ok({"latitude": "42.48948", "longitude": "-83.14465", "postal_code": "48067", "radius": "500", "country": "US", "region": "Michigan", "network": "50.220.147.113-50.220.147.113", "timezone": "America/Detroit", "city": "Royal Oak", "geoname_id": "5007804"})
So that looks great.
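Since every value in that map is a string, numeric fields such as `latitude` need parsing before use. A minimal sketch of that step, assuming the field names shown in the output above; the `Location` struct and the `parse_location` and `to_record` helpers are hypothetical names used only for illustration, not part of Trippy or the `maxminddb` crate:

```rust
use std::collections::HashMap;

/// Hypothetical typed view of a few fields from a flat IPinfo record.
#[derive(Debug, PartialEq)]
struct Location {
    city: String,
    latitude: f64,
    longitude: f64,
    radius: u32,
}

/// Builds a `Location` from a flat string map.
/// Returns `None` if a field is missing or a number fails to parse.
fn parse_location(record: &HashMap<String, String>) -> Option<Location> {
    Some(Location {
        city: record.get("city")?.clone(),
        latitude: record.get("latitude")?.parse::<f64>().ok()?,
        longitude: record.get("longitude")?.parse::<f64>().ok()?,
        radius: record.get("radius")?.parse::<u32>().ok()?,
    })
}

/// Test helper: builds an owned map from string pairs.
fn to_record(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

fn main() {
    let record = to_record(&[
        ("city", "Royal Oak"),
        ("latitude", "42.48948"),
        ("longitude", "-83.14465"),
        ("radius", "500"),
    ]);
    println!("{:?}", parse_location(&record));
}
```

Returning `Option` keeps a malformed or incomplete record from panicking; the caller decides how to handle a missing location.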
The question now is, how does Trippy know if a given `mmdb` file is MaxMind or IpInfo flavoured? Is there some trick to figuring that out? I guess it could try both and see if either works?
I see that the mmdb files have a `metadata` attribute which could help tell them apart. Comparing the MaxMind and IpInfo `mmdb` files I can see this for the `database_type` attribute:
MaxMind (`GeoLite2-City.mmdb`):
Metadata { database_type: "GeoLite2-City" }
IpInfo (`ip_geolocation_extended_ipv4_sample.mmdb`):
Metadata { database_type: "ipinfo ip_geolocation_extended_ipv4_sample.mmdb" }
So unlike the MaxMind file, the IpInfo file has a `database_type` with the format `ipinfo <file>`, is that guaranteed to be the case?
@abdullahdevrel I would like Trippy to be able to consume either the free "IP to Country + ASN Database" `mmdb` file or the premium "IP to Geolocation Extended Database" `mmdb` file.
One quirk I notice is that the free "IP to Country + ASN Database" `mmdb` file has both `country` (code) and `country_name` fields whereas the premium "IP to Geolocation Extended Database" `mmdb` file has only the `country`.
From https://ipinfo.io/developers/ip-to-country-asn-database:
FIELD NAME | EXAMPLE | DATA TYPE | DESCRIPTION |
---|---|---|---|
country | JP | TEXT | ISO 3166 country code of the location |
country_name | Japan | TEXT | Name of the country |
From https://ipinfo.io/developers/ip-to-geolocation-extended:
FIELD NAME | EXAMPLE | DATA TYPE | DESCRIPTION |
---|---|---|---|
country | US | TEXT | ISO 3166 country code of the location |
Same story for `continent`.
Hey @fujiapple852
My apologies for the late response. I really appreciate you considering our data for Trippy.
Just an FYI, my Rust skills are not very good.
> How does Trippy know if a given mmdb file is MaxMind or IpInfo flavoured? Is there some trick to figuring that out? I guess it could try both and see if either works?
That is a very good question. MaxMind uses a nested data structure for their MMDB databases, while IPinfo uses a flat data structure.
MaxMind data structure for MMDB:
IPinfo data structure for MMDB:
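As a rough illustration of the difference in Rust terms, the sketch below contrasts the two shapes. These are simplified, hypothetical structs written for this comparison; they are not the real `maxminddb` types and cover only a few fields:

```rust
/// MaxMind-style (nested): values live inside sub-objects.
struct NestedCountry {
    iso_code: String,
}

struct NestedLocation {
    latitude: f64,
    longitude: f64,
}

struct NestedRecord {
    country: NestedCountry,
    location: NestedLocation,
}

/// IPinfo-style (flat): every value is a top-level string field.
struct FlatRecord {
    country: String,
    latitude: String,
    longitude: String,
}

/// Reading the country code means walking one level down in the nested shape...
fn nested_country(record: &NestedRecord) -> &str {
    &record.country.iso_code
}

/// ...but is a direct field access in the flat shape.
fn flat_country(record: &FlatRecord) -> &str {
    &record.country
}

fn main() {
    let nested = NestedRecord {
        country: NestedCountry { iso_code: "US".to_string() },
        location: NestedLocation { latitude: 42.48948, longitude: -83.14465 },
    };
    let flat = FlatRecord {
        country: "US".to_string(),
        latitude: "42.48948".to_string(),
        longitude: "-83.14465".to_string(),
    };
    println!("{} {} {}", nested_country(&nested), nested.location.latitude, nested.location.longitude);
    println!("{} {} {}", flat_country(&flat), flat.latitude, flat.longitude);
}
```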
As you have seen in MaxMind's MMDB reader library, they have declared the structs themselves, so they have native support for their different databases. In the case of IPinfo, you have to declare the struct based on the database schema, which you have already done in #871.
IPinfo has a flat and predictable data structure. The key will return an empty string even if the value does not exist. Boolean values are represented as the strings `true` and `""`.
For Rust, this is usually what I send to users: https://gist.github.com/abdullahdevrel/ace2c80bd53a7323a18bbf8c8ae6a4d2
> So unlike the MaxMind file, the IpInfo file has a database_type with the format `ipinfo <file>`, is that guaranteed to be the case?
Yes. The `database_type` information will be prefaced with `ipinfo`.
$ mmdbctl metadata ipinfo_country_asn.mmdb
- Binary Format 2.0
- Database Type ipinfo country_asn.mmdb
- IP Version 6
- Record Size 32
- Node Count 5458524
- Description
en ipinfo country_asn.mmdb
- Languages en
- Build Epoch 1702629871
I think this `database_type` value is added when the data is compiled from the CSV file to the MMDB database.
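Given that prefix, telling the two flavours apart can be a simple string check on the metadata. A minimal sketch, assuming the caller has already read `database_type` from the file's metadata; the `Flavour` enum and `detect_flavour` function are hypothetical names, and treating everything without the prefix as MaxMind is an assumption made for this example:

```rust
/// Hypothetical classification of an mmdb file by its metadata.
#[derive(Debug, PartialEq)]
enum Flavour {
    IpInfo,
    MaxMind,
}

/// Classifies a file from its `database_type` metadata string.
/// Assumption: anything not prefixed with "ipinfo" is treated as MaxMind.
fn detect_flavour(database_type: &str) -> Flavour {
    if database_type.starts_with("ipinfo") {
        Flavour::IpInfo
    } else {
        Flavour::MaxMind
    }
}

fn main() {
    for db_type in ["ipinfo country_asn.mmdb", "GeoLite2-City"].iter() {
        println!("{} -> {:?}", db_type, detect_flavour(db_type));
    }
}
```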
> I'm not quite sure what this is for, presumably the token is used for looking up the ipinfo API? I see you have an API for that. Would the token be something that could be bundled in Trippy for all users or just for development use?
The access token is for downloading the IPinfo database. To download the database, users need to run a command like this:
curl -L https://ipinfo.io/data/free/country_asn.mmdb?token=<ACCESS_TOKEN>
Our API, by contrast, supports 1,000 tokenless requests/day and 50,000 requests/month with a token. Unlike our free IP database, the free API does provide city and zip code level information.
> I would like Trippy to be able to consume either the free "IP to Country + ASN Database" mmdb file or the premium "IP to Geolocation Extended Database" mmdb file.
We would love it if you could use the "IP to Country + ASN Database". It is free and easily accessible for the project and the users, and it does not compromise accuracy at all. Support for this database would be incredible.
Here is the mmdb version of that database: https://www.transfernow.net/dl/20231218MUeQ39J8 (available for 7 days)
> One quirk I notice is that the free "IP to Country + ASN Database" mmdb file has both country (code) and country_name fields whereas the premium "IP to Geolocation Extended Database" mmdb file has only the country.
We wanted to make the free IP to Country ASN database as accessible as possible. In our geolocation database, we do not provide the full country name or continent name, and we usually recommend that users use a reference object/dictionary for the full country name, currency, continent, isEu, etc.
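Such a reference dictionary can be as simple as a lookup keyed by the ISO 3166 country code. A minimal sketch; the `country_name` function is a hypothetical helper and the two entries are only an illustrative excerpt, where a real table would cover every code:

```rust
/// Maps an ISO 3166 country code to a full country name.
/// Illustrative excerpt only; unknown codes return `None`.
fn country_name(iso_code: &str) -> Option<&'static str> {
    match iso_code {
        "JP" => Some("Japan"),
        "US" => Some("United States"),
        _ => None,
    }
}

fn main() {
    for code in ["JP", "US", "ZZ"].iter() {
        println!("{} -> {:?}", code, country_name(code));
    }
}
```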
For posterity
You have addressed this issue, but admittedly, I have not prepared the best Rust documentation. I am addressing it here in case someone stumbles upon this.
> `let city_data = reader.lookup::<City<'_>>(addr);` Fails with `Err(DecodingError("invalid type: string \"Royal Oak\", expected struct City"))`
This is due to IPinfo not having package-native struct declarations. The user has to declare their own structs, and they should not declare a "generic argument" to the `lookup` function like `<City<'_>>(addr)`.
> Perhaps this is what you mean by the data being "flat"? Perhaps I have to deserialise to a custom struct with the "flat" structure? Is there a recommend mmdb reader crate for Rust that support ipinfo flavour of mmdb files?
Yes, this is spot on. The mmdb reader crate does its job of reading the mmdb files perfectly. However, as MaxMind developed this crate, they have native support for their databases through declaring the structs within the package.
IPinfo and MaxMind's databases are structured differently. So, when using IPinfo's database with mmdb reader crate, users need to declare the structs based on the database schema of the IPinfo database they are using.
Example:
Hi again @abdullahdevrel and thank you for the comprehensive reply!
> The key will return an empty string even if the value does not exist
That is good to know, I'll adjust my impl accordingly to treat empty string as `None` (I don't think there are any boolean values to worry about here).
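That adjustment can be sketched in a couple of small helpers. Both function names are hypothetical and this is not Trippy's actual implementation; `flag` covers the boolean-as-string convention described earlier in the thread, in case it is ever needed:

```rust
/// Treats an empty string as "no value".
fn non_empty(value: &str) -> Option<&str> {
    if value.is_empty() {
        None
    } else {
        Some(value)
    }
}

/// Interprets the string-encoded boolean convention: "true" is true, "" is false.
/// Assumption: any other value is also treated as false.
fn flag(value: &str) -> bool {
    value == "true"
}

fn main() {
    println!("{:?}", non_empty("Royal Oak"));
    println!("{:?}", non_empty(""));
    println!("{} {}", flag("true"), flag(""));
}
```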
> For Rust, this is usually what I send to users: https://gist.github.com/abdullahdevrel/ace2c80bd53a7323a18bbf8c8ae6a4d2
Ah yes, that works well.
> and they should not declare a "generic argument" to the lookup function like <City<'_>>(addr).
Nit: note that these are equivalent (the latter infers the type parameter `T` from the return type of `lookup`, which must be `IpinfoCountryASN` to be assigned to `record`):
let record = reader.lookup::<IpinfoCountryASN>(ip_address).unwrap()
let record: IpinfoCountryASN = reader.lookup(ip_address).unwrap()
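The same equivalence can be seen with only the standard library. The sketch below uses `str::parse` as an analogy for `lookup` (it is not the `maxminddb` API): the target type is given once via the turbofish and once via the binding's type annotation, and both forms resolve to the same call. The two function names are hypothetical:

```rust
/// Names the target type explicitly with the turbofish syntax.
fn parse_with_turbofish(input: &str) -> u16 {
    let value = input.parse::<u16>().unwrap();
    value
}

/// Lets the compiler infer the target type from the binding's annotation.
fn parse_with_annotation(input: &str) -> u16 {
    let value: u16 = input.parse().unwrap();
    value
}

fn main() {
    // Both forms select the same `FromStr` implementation.
    println!("{}", parse_with_turbofish("871") == parse_with_annotation("871"));
}
```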
> Yes. The database_type information will be prefaced with ipinfo.
Perfect, that was the key thing I needed to know.
> We would love if you could use the "IP to Country + ASN Database". It is free and easily accessible for the project and the users, but it does not compromise accuracy at all. Support for this database would be incredible.
Trippy can certainly support that (it would use the country and continent names from that file, the AS data is not needed as it comes from elsewhere already).
Trippy can also support the extra attributes (city, postcode, lat/long/radius etc.) provided by the premium files, in a way where Trippy will look for and use these fields if available in the file provided. To put it another way, a user can provide either the "IP to Country + ASN Database" or the "IP to Geolocation Extended Database" mmdb file and Trippy will pick out the data it needs from either. Does that work?
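A rough sketch of that "use whatever is present" idea, assuming the record has already been decoded into a flat string map and using only field names quoted earlier in this thread (`country`, `country_name`, `city`). The `GeoInfo` struct and the helper functions are hypothetical names, not Trippy's actual implementation:

```rust
use std::collections::HashMap;

/// Hypothetical result type: every field is optional so that either
/// database schema can populate it.
#[derive(Debug, PartialEq)]
struct GeoInfo {
    country_code: Option<String>,
    country_name: Option<String>,
    city: Option<String>,
}

/// Returns the value for `key` if present and non-empty.
fn field(record: &HashMap<String, String>, key: &str) -> Option<String> {
    record.get(key).filter(|v| !v.is_empty()).cloned()
}

/// Picks out whichever of the known fields the record provides.
fn extract(record: &HashMap<String, String>) -> GeoInfo {
    GeoInfo {
        country_code: field(record, "country"),
        country_name: field(record, "country_name"),
        city: field(record, "city"),
    }
}

/// Test helper: builds an owned map from string pairs.
fn to_record(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

fn main() {
    // Shaped like the free "IP to Country + ASN" schema: no city.
    let free = to_record(&[("country", "JP"), ("country_name", "Japan")]);
    // Shaped like the "IP to Geolocation Extended" schema: no country_name.
    let extended = to_record(&[("country", "US"), ("city", "Royal Oak")]);
    println!("{:?}", extract(&free));
    println!("{:?}", extract(&extended));
}
```

Fields missing from one schema simply come back as `None`, so the caller does not need to know which file the user supplied.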
> Here is the mmdb version of that database: https://www.transfernow.net/dl/20231218MUeQ39J8 (available for 7 days)
Thank you, I have downloaded the file. Is this the same as the file I can download from https://ipinfo.io/account/data-downloads? (I registered account `FujiApple` on ipinfo.io a while ago).
@abdullahdevrel if you could help check the tests I added in #871 then we should be able to merge this.
Merged. This will be included in the 0.10.0 release of Trippy and will be mentioned in the release note.
Thank you very much @fujiapple852!! Really appreciate it!!
I am the DevRel of IPinfo. I would like to request supporting the IPinfo free IP to Country dataset in Trippy. Features of the database:
The database comes in MMDB format, so I believe it can be easily ingested in the project. Also, the data structure is flat and predictable. You can package our free IP to the Country ASN database with the project. For that, we will provide an access token that you can use. By using the IPinfo dataset, you can get both country-level geolocation information and ASN information from a single source.
Please let me know what you think. If you need any assistance, please let me know. Thanks.
Schema: https://ipinfo.io/developers/ip-to-country-asn-database