philhagen / sof-elk

Configuration files for the SOF-ELK VM
GNU General Public License v3.0

[Feature Request] Support for IPinfo IP to Country ASN database #288

Closed abdullahdevrel closed 1 year ago

abdullahdevrel commented 1 year ago

Requesting integration of IPinfo's free IP to Country ASN database. Features of the database:

Although the database does not provide city-level data, it is not a reduced-accuracy version database but rather a limited yet fully accurate IP Geolocation database. The database provides granular data records for each IP address and does not aggregate ranges.

Please let me know what you think. If you need any assistance, please let me know. Thanks.

Schema: https://ipinfo.io/developers/ip-to-country-asn-database

| Field name | Example | Data type | Description |
|---|---|---|---|
| start_ip | 1.0.16.0 | TEXT | Starting IP address of an IP address range |
| end_ip | 1.0.31.255 | TEXT | Ending IP address of an IP address range |
| country | JP | TEXT | ISO 3166 country code of the location |
| country_name | Japan | TEXT | Name of the country |
| continent | AS | TEXT | Continent code of the country |
| continent_name | Asia | TEXT | Name of the continent |
| asn | AS2519 | TEXT | Autonomous System Number |
| as_name | ARTERIA Networks Corporation | TEXT | Name of the AS (Autonomous System) organization |
| as_domain | arteria-net.com | TEXT | Official domain or website of the AS organization |

philhagen commented 1 year ago

Hello and thanks for filing this issue! The whole MaxMind situation has led to me putting a lot of thought into how to move forward. Right now, we are including the final "distributable license" databases with the VM, and they are also installed via the Ansible build process. I've kept an eye on competitors, including IPInfo.

Right now, I'm planning to keep the MaxMind solution in place for a few reasons:

That all said, I'll admit that if we had to leave MaxMind today, IPInfo would be the odds-on choice for all the reasons you describe. I'll close this for now, but will circle back to this issue if we move to replace or provide a choice between the two libraries in the future.

abdullahdevrel commented 1 year ago

@philhagen I really appreciate you taking a look. I have to admit that the engineering investment required to support or even migrate to us is non-trivial, but I thought I would reach out to you.

If you ever need any help with our data, we are happy to assist you. We have an active community for supporting OSS projects. By the way, if you are exploring the MMDB databases, feel free to check out our MMDBctl tool or even Kaggle where we are hosting our data.

Again, thank you very much for reviewing the proposal. Feel free to ping us anytime.

philhagen commented 1 year ago

Certainly - I think we're directionally aligned for sure. IMO, this is a "when" not an "if" thing - but I'll need to allocate adequate time to make sure it's as seamless as possible for users - while not incurring a "just distribute two different VMs" workload on me :). That said, I have some ideas swirling around the back of my head and will try to follow up on them when I clear a few major projects that are nearing completion.

One thing that could help is if you were able to confirm or validate whether the field names are the same between the MaxMind and IPInfo databases - e.g., that simple drop-in replacement files are a viable path to adoption.

abdullahdevrel commented 1 year ago

I am really glad to hear that.

> One thing that could help is if you were able to confirm or validate whether the field names are the same between the MaxMind and IPInfo databases - e.g., that simple drop-in replacement files are a viable path to adoption.

Even though we also offer the MMDB format, our data structure is a bit different. MaxMind's DBs are nested and ours is flat. Moreover, I believe their database is not consistent, which means you have to use try/catch statements to check whether a key exists before reading its value.

If you want to look up the country information of an IP address in MaxMind, you have to write deep indexing queries like response['country']['iso_code'], and you need to wrap the query in a try/catch statement to avoid indexing errors. In our case it is just response['country']; if the data does not exist, we return an empty string.
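As a rough sketch of the difference described above, the two lookups might look like this in Python. The dictionaries are hand-written stand-ins for a single lookup result, not output from either vendor's reader, and the helper names `country_from_nested` and `country_from_flat` are hypothetical.

```python
# Hypothetical records shaped like the two layouts discussed in this thread.
# They are illustrative stand-ins, not real database output.
maxmind_style = {"country": {"iso_code": "JP", "names": {"en": "Japan"}}}
ipinfo_style = {"country": "JP", "country_name": "Japan"}


def country_from_nested(record):
    """Read the country code from a nested record, tolerating missing keys."""
    try:
        return record["country"]["iso_code"]
    except (KeyError, TypeError):
        # The "country" block or its "iso_code" entry is absent.
        return ""


def country_from_flat(record):
    """Read the country code from a flat record; empty string if absent."""
    return record.get("country", "")


print(country_from_nested(maxmind_style))
print(country_from_flat(ipinfo_style))
```

Because the field paths differ, a straight file swap would likely still need a small mapping layer like this wherever the country field is read.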

MaxMind's data image

IPinfo's data image