Open abdullahdevrel opened 3 months ago
A Reddit post by redspidr demonstrated the idea of bringing IPinfo's data into OPNsense. Their project turns IP addresses into links to the corresponding IPinfo pages, which provide detailed metadata on those IP addresses.
Inspired by it, I have asked the Reddit OPNsense community to review this ticket and to recommend bringing our data to OPNsense, starting with the integration of our free IP database. This issue is not about the project demonstrated by the Reddit user redspidr, but about a native, under-the-hood integration using our database.
A couple of issues were raised by one of the Reddit community users regarding this request, so I am pasting my answers here.
> Inclusion of your data rather than the existing provider in the business version?

The free IP database that we have is the best possible variant and is equal to, if not better than, the paid version of the country database that the business version of OPNsense currently uses. Consequently, the database is certainly better than the free IP database that the community version of the project currently uses.
This poses a challenge: which course of action benefits the OPNsense community most?
> Documentation on how to add your data to the free version in parallel to the existing docs?
Considering the previous topic, I am not sure what option would be considered by the project maintainers. If they want to replace their existing database provider they can do that or they can integrate our database in parallel to the existing database.
In terms of documentation, there are slight modifications involved.
Here is a blog post: https://ipinfo.io/blog/migrating-from-maxmind-to-ipinfo
Please let me know if you have further questions. Thank you very much.
For us as a core team this isn't a priority, given our business edition does contain simple to use geo aliases out of the box including a documented file format to use for the community version (https://docs.opnsense.org/manual/aliases.html#geoip).
I do understand that your company will prefer your product above one of its competitors, but there's also some marketing involved in claims being made.
Personally I don't have a strong preference for a geoip vendor, but when it's a commercial discussion, our community GitHub might not be the best place.
If someone does want to do the work, and the amount of required guidance is limited, we will assess in the usual way.
@AdSchellevis
Thank you for taking the time to review the request. I sincerely appreciate it.
This was not a commercial request, nor am I trying to sell the OPNsense community a commercial service. I advocated bringing in highly accurate data designed with open-source projects in mind. The free database is licensed under CC-BY-SA 4.0.
> I do understand that your company will prefer your product over one of its competitors, but there's also some marketing involved in the claims being made.

I understand the skepticism involved. However, in terms of accuracy, we can provide verifiable information to back up our claim, even for a free database. If you are interested in verifying our accuracy claims, please let me know. I can walk you through a self-evaluation process that lets you and the community verify this information personally.
> Personally, I don't have a strong preference for a geoip vendor, but when it's a commercial discussion, our community GitHub might not be the best place.

No, this is not a commercial discussion in any form. The proposal was the integration of a free database. There was absolutely no hint of a commercial service. My apologies if I have indicated otherwise. I have tried my best to understand the issues and the motivation behind the choice of geoip database, and I have seen a ticket where you mentioned that the software offers the paid IP database through the business version.
However, my proposal was to replace even the paid version of the IP database with a free IP database that we can demonstrate can provide better accuracy.
> If someone does want to do the work, and the amount of required guidance is limited, we will assess it in the usual way.
Thank you for considering the issue.
My apologies if I was unclear in saying this is not a commercial service. We built this free IP to Country database primarily to support open source projects. I understand Opnsense is a massive project and the changes required to adopt it may be significant. I can assure you that we can demonstrate the value of adopting the free database to the community and the project's customers.
Would love to see alternatives to MaxMind as well. Unfortunately, I definitely do not have the time to do the coding at the moment.
Now, there would be a super-easy and fast way to get this available in OPNsense @abdullahdevrel - "simply" provide the data in the CSV format documented and required for OPNsense. 😉
Thanks @doktornotor. I really appreciate that you reviewed the request. This is a significant request, and I understand that it will require engineering commitment to support it. We will let our OPNsense users know that this request is being considered.
> Now, there would be a super-easy and fast way to get this available in OPNsense @abdullahdevrel - "simply" provide the data in the CSV format documented and required for OPNsense. 😉

I hope this shows what we meant when we said our database is simple to use. The current implementation requires 3 CSV files.
We, by contrast, have all of this information in a single file in our IP to Country database:
start_ip | end_ip | country | country_name | continent | continent_name |
---|---|---|---|---|---|
2620:0:1cff:dead:bef1:100:1:1aa | 2620:0:1cff:dead:bef1:100:1:1b0 | SG | Singapore | AS | Asia |
212.221.79.153 | 212.221.79.171 | DE | Germany | EU | Europe |
https://github.com/ipinfo/sample-database/tree/main/IP%20to%20Country
If anyone wants to use our data for now, they will have to make modifications to the database on their end.
First of all, I am NOT an OPNsense developer, merely a random code contributor.
> If anyone wants to use our data for now, they will have to make modifications to the database on their end.

Well yes, that is the problem. I was merely hinting at the fastest way to get your GeoIP data used in OPNsense, without any coding being required on the OPNsense side (paste in a URL pointing to a ZIP file with the required CSV files, done).
Using a single CSV file might even be easier and faster to process if someone does the coding. However, it is not a drop-in replacement. Not having the IP ranges in CIDR format is one example of why the current code won't work, and a non-trivial amount of coding is required to support this single-file format.
Got it. Thank you.
I am not sure why the project does not use the MMDB file format, which is designed for fast and efficient lookups.
> Not having the IP ranges in CIDR format being one of the examples why the current code won't work and non-trivial amount of coding is required to support this single-file format.

We have a tool for that called range2cidr (which is also part of our CLI) that can generate the CIDR/range column.
cidr | country | country_name | continent | continent_name |
---|---|---|---|---|
1.0.0.0/25 | AU | Australia | OC | Oceania |
1.0.0.128/26 | AU | Australia | OC | Oceania |
The issue is that generating the CIDR column is a bit slow:
time ipinfo range2cidr country.csv > country_cidr.csv
real 9m58.852s
user 0m22.040s
sys 0m44.654s
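For readers who want to see what the range-to-CIDR step involves, here is a minimal sketch using Python's standard `ipaddress` module. It is not how range2cidr is implemented, and the helper name `range_to_cidrs` is hypothetical. It only illustrates the conversion on one of the sample rows quoted earlier in this thread.

```python
import ipaddress


def range_to_cidrs(start_ip: str, end_ip: str) -> list[str]:
    """Return the CIDR blocks that exactly cover start_ip..end_ip (inclusive).

    Hypothetical helper for illustration; works for IPv4 and IPv6,
    but both ends must be the same IP version.
    """
    first = ipaddress.ip_address(start_ip)
    last = ipaddress.ip_address(end_ip)
    return [str(net) for net in ipaddress.summarize_address_range(first, last)]


if __name__ == "__main__":
    # Sample range taken from the IP to Country rows shown in this thread.
    for cidr in range_to_cidrs("212.221.79.153", "212.221.79.171"):
        print(cidr)
```

A single start/end range can expand into several CIDR blocks, which is part of why the converted file has more rows than the original.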
Yeah, a bit. 😉 Same issue with Python, PHP or whatever other code used for the purpose on similar projects. Now, assuming similar HW specs, multiply the wasted CPU time by the user base and one update per day.
As for why MMDB format is not used - the thing is - you are not doing realtime lookups. You simply parse the data once in a while and use that parsed data in firewall rules to reject/accept connections. There's no integration for lookups in databases present in the pf firewall code, plus the performance would not exactly rock either I guess.
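The parse-once approach described above can be sketched roughly as follows. This is not OPNsense's actual code. It assumes a comma-separated file whose columns match the CIDR sample table earlier in the thread, and the names `SAMPLE` and `group_cidrs_by_country` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Assumed input: comma-separated, with the column names from the CIDR sample above.
SAMPLE = """cidr,country,country_name,continent,continent_name
1.0.0.0/25,AU,Australia,OC,Oceania
1.0.0.128/26,AU,Australia,OC,Oceania
"""


def group_cidrs_by_country(csv_text: str) -> dict[str, list[str]]:
    """Parse the CSV once and collect the CIDR blocks for each country code."""
    tables = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        tables[row["country"]].append(row["cidr"])
    return dict(tables)


if __name__ == "__main__":
    for country, cidrs in group_cidrs_by_country(SAMPLE).items():
        print(country, cidrs)
```

The per-country lists are the kind of static data a firewall rule can reference, so no database lookup is needed when a packet arrives.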
I will try to think of a solution. The challenge is that we probably won't be able to produce the CIDR variant of the free IP database for download because we would then have to account for maintenance of another variant of the same product.
When MM switched from their legacy geoip to a more modern variant, it broke a lot of things. We aimed to remain stable from day 0 onward to avoid such situations. Introducing a CIDR variant of the database could increase the load on us as we would have to maintain it virtually indefinitely.
Is your feature request related to a problem? Please describe.
Using the IPinfo IP to Country ASN or IP to Country database will address several problems with the current IP geolocation implementation:
Describe the solution you'd like
I am requesting to add support for IPinfo's IP to Country database to the project. The database has the following features:
Database schema
start_ip
end_ip
country
country_name
continent
continent_name
asn
as_name
as_domain
Documentation: https://ipinfo.io/developers/ip-to-country-asn-database
Samples are available here: https://github.com/ipinfo/sample-database/tree/main/IP%20to%20Country%20ASN
The database can be downloaded simply by accessing the storage URI with an access token.
Describe alternatives you considered
I have not considered an alternative.
Additional context
The business version of OPNsense includes a paid version of the GeoIP country database. However, even though IPinfo's IP to Country database is free, it is the best country-level data available, because it is built on a methodology based on latency and network data rather than on locations self-reported by ASNs/ISPs.
https://ipinfo.io/accuracy
Additionally, there is no range clustering or delayed updates with IPinfo. IPinfo does not have an accuracy-compromised free country or city database. This database can be considered for the business variant of the software as well, since the license permits commercial use.