ooni / probe-engine

Semi-automatic export of https://github.com/ooni/probe-cli internals
https://ooni.org
GNU General Public License v3.0

Manage MaxMindDB license change #269

Closed bassosimone closed 3 years ago

bassosimone commented 4 years ago

This is the master issue for managing the MaxMindDB license change. Child issues:

When creating this issue, I labeled it as an epic, since it looks like a very large piece of work to me.

hellais commented 4 years ago

More on this was discussed during the OONI monthly community meeting. Here are some notes from it:

So basically MaxMind, completely out of the blue, made some significant changes to the license of their GeoIP database. See: https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/ This is what we use in OONI Probe to turn an IP address into an ASN and a country code locally, so that we never have to learn users' IP addresses. However, since we ship this database, under their new license agreement we would have to make substantial changes to our app. The primary one is that we would have to force users to accept a terms of service agreement that includes the legalese from MaxMind.

This includes stuff like: "You shall cease use of and destroy (i) any old versions of the Services within thirty (30) days following the release of the updated GeoLite2 Databases"; "You will maintain reasonable and appropriate technical and organizational measures for the protection of the security, confidentiality, and integrity of the Services"; "You are responsible for the acts or omissions of any third parties with which you share the Services". Anyway, it’s basically a pretty annoying situation, which we would very much like to avoid.

We consulted a lawyer last week, together with @sbs, to get some advice on how to proceed. It seems our best option is probably to stop doing the geolocation on the user's device and to do it on some other service instead. This would change the threat model slightly, because we would then know the user's IP address for some amount of time (rather than using some random IP service to discover it and then doing the resolution locally).

This came up too:

"Matomo will now use db-ip.com as a geolocation provider..." https://matomo.org/changelog/matomo-3-13-1/

This too:

If it's useful for the user to know their country locally, you could stop-gap with something like https://github.com/willscott/ip2country. That one directly generates a database from BGP announcements. It is not as accurate in all cases but, as I understand it, will do reasonably well for most end-user ISPs.
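
To make the idea of a prefix-based database concrete, here is a minimal sketch in Go of how a table of announced prefixes could be queried with a longest-prefix match. It is not ip2country's actual code or data format: the `prefixEntry` type, the `exampleTable` contents (documentation address ranges with made-up countries), and the use of `ZZ` as the "unknown" value are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/netip"
)

// prefixEntry maps one announced prefix to a country code (hypothetical type).
type prefixEntry struct {
	Prefix  netip.Prefix
	Country string
}

// exampleTable is made-up data using documentation address ranges; a real
// table would be derived from BGP announcements.
var exampleTable = []prefixEntry{
	{netip.MustParsePrefix("192.0.2.0/24"), "IT"},
	{netip.MustParsePrefix("192.0.2.128/25"), "IE"},
	{netip.MustParsePrefix("2001:db8::/32"), "DE"},
}

// lookupCountry returns the country of the most specific prefix containing
// ip, or "ZZ" (an assumed placeholder for "unknown") when ip is invalid or
// no prefix matches.
func lookupCountry(table []prefixEntry, ip string) string {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "ZZ"
	}
	best, bestBits := "ZZ", -1
	for _, e := range table {
		// Prefer the longest (most specific) matching prefix.
		if e.Prefix.Contains(addr) && e.Prefix.Bits() > bestBits {
			best, bestBits = e.Country, e.Prefix.Bits()
		}
	}
	return best
}

func main() {
	for _, ip := range []string{"192.0.2.1", "192.0.2.200", "2001:db8::1", "198.51.100.7"} {
		fmt.Println(ip, lookupCountry(exampleTable, ip))
	}
}
```

A linear scan keeps the sketch short; a real database would more likely use a trie or a sorted range table so that lookups do not grow with the number of prefixes.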

hellais commented 4 years ago

We should evaluate ip2country and db-ip.com. Perhaps we can use them to generate a compatible database.

tomac4t commented 4 years ago

I could not find any related information on DB-IP's FAQ page, but a comment on https://github.com/matomo-org/matomo/issues/15308 says that the free database does not contain IPv6 addresses:

Per their FAQ page at https://db-ip.com/faq.php , the free DB-IP city database does not include IPv6 addresses.

hellais commented 4 years ago

I created a new issue for the next step, to be done during the next sprint, and added it to this epic: https://github.com/ooni/probe-engine/issues/306.

bassosimone commented 4 years ago

@tomac4t Thank you so much for checking! I also cannot find any mention of IPv6 there. Also:

$ mmdblookup --file dbip-country-lite-2020-02.mmdb --ip 2a00:1450:4002:804::200e 

  {
    "continent": 
      {
        "code": 
          "EU" <utf8_string>
        "geoname_id": 
          6255148 <uint32>
        "names": 
          {
            "de": 
              "Europa" <utf8_string>
            "en": 
              "Europe" <utf8_string>
            "es": 
              "Europa" <utf8_string>
            "fa": 
              " اروپا" <utf8_string>
            "fr": 
              "Europe" <utf8_string>
            "ja": 
              "ヨーロッパ大陸" <utf8_string>
            "ko": 
              "유럽" <utf8_string>
            "pt-BR": 
              "Europa" <utf8_string>
            "ru": 
              "Европа" <utf8_string>
            "zh-CN": 
              "欧洲" <utf8_string>
          }
      }
    "country": 
      {
        "geoname_id": 
          2963597 <uint32>
        "is_in_european_union": 
          true <boolean>
        "iso_code": 
          "IE" <utf8_string>
        "names": 
          {
            "de": 
              "Irland" <utf8_string>
            "en": 
              "Ireland" <utf8_string>
            "es": 
              "Irlanda" <utf8_string>
            "fa": 
              "ایرلند" <utf8_string>
            "fr": 
              "Irlande" <utf8_string>
            "ja": 
              "アイルランド" <utf8_string>
            "ko": 
              "아일랜드" <utf8_string>
            "pt-BR": 
              "Irlanda" <utf8_string>
            "ru": 
              "Ирландия" <utf8_string>
            "zh-CN": 
              "爱尔兰" <utf8_string>
          }
      }
  }

So, I believe there is IPv6 now in the free db-ip.com database!

tomac4t commented 4 years ago

http://server-nexa.polito.it/pipermail/nexa/2020-January/016629.html

  "If we cannot do this, can we create a freely accessible web service to which we can connect our apps to obtain geolocation?"

Sounds good. Then, would it be enough for the OONI website to delete the old version of GeoLite2 and upgrade to the new one every 30 days? Maybe we should ask support@maxmind.com...


I set up a service and it looks good to me. OONI Probe needs ip, country_code, asn ~and organization~.

2020-02-23_11-18
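
For illustration, a service of this kind could look roughly like the Go sketch below. It is not the service shown in the screenshot: the JSON key names, the `geolocate` and `newHandler` helpers, and the `fakeLookup` function (which returns made-up values for a documentation address) are all assumptions. A real deployment would back the lookup with a geolocation database kept on the server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// geoResponse carries the three fields listed above. The JSON key names are
// assumptions, not a documented OONI API.
type geoResponse struct {
	IP          string `json:"ip"`
	CountryCode string `json:"country_code"`
	ASN         uint   `json:"asn"`
}

// lookupFunc resolves an IP address to a country code and an AS number.
type lookupFunc func(ip string) (cc string, asn uint)

// geolocate builds the response for a client whose address is remoteAddr,
// in the "host:port" form found in http.Request.RemoteAddr.
func geolocate(remoteAddr string, lookup lookupFunc) (geoResponse, error) {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return geoResponse{}, err
	}
	cc, asn := lookup(host)
	return geoResponse{IP: host, CountryCode: cc, ASN: asn}, nil
}

// newHandler exposes geolocate over HTTP.
func newHandler(lookup lookupFunc) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		resp, err := geolocate(r.RemoteAddr, lookup)
		if err != nil {
			http.Error(w, "bad remote address", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(resp)
	})
}

// fakeLookup is a stand-in for a real database and returns made-up values.
func fakeLookup(ip string) (string, uint) {
	if ip == "192.0.2.1" {
		return "IT", 64496
	}
	return "ZZ", 0
}

func main() {
	// Exercise the handler in-process, without opening a network port.
	req := httptest.NewRequest("GET", "/", nil)
	req.RemoteAddr = "192.0.2.1:54321"
	rec := httptest.NewRecorder()
	newHandler(fakeLookup).ServeHTTP(rec, req)
	fmt.Print(rec.Body.String())
}
```

Note that such a handler necessarily sees the client's IP address while serving the request, which is the threat-model change discussed earlier in this thread.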

bassosimone commented 4 years ago

We further discussed @tomac4t's proposal on Slack. Here's the relevant excerpt:

@bassosimone: we are already quite dependent on the probe interacting with the backend (think, e.g., of the Web Connectivity test helper), so it seems to me we can quite easily argue, in this vein, that a service doing geolocation is better for users b/c they don't need to download ~7 MiB of database

@hellais: Yeah I think this makes sense.

So, we're quite likely going to move in that direction.

bassosimone commented 4 years ago

Even if we implement ASN and country lookup for the probe in the backend, we also want to annotate measurements (and specifically DNS lookups) with the ASN of the addresses they contain. For this reason, we want, at least, to generate our own ASN database and ship it to users; see https://github.com/ooni/probe-engine/issues/620.
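
As a rough illustration of what annotating DNS lookups could mean, the Go sketch below attaches an AS number to each resolved address using a locally available lookup function. The `annotatedAddr` type, the `asnLookup` signature, and the table contents (documentation addresses and AS numbers) are hypothetical and do not reflect the probe-engine data format.

```go
package main

import "fmt"

// asnLookup maps an IP address to an AS number; 0 means unknown.
type asnLookup func(ip string) uint

// annotatedAddr is a hypothetical record pairing a resolved address with
// the ASN it belongs to.
type annotatedAddr struct {
	IP  string
	ASN uint
}

// annotate attaches an ASN to each address returned by a DNS lookup.
func annotate(addrs []string, lookup asnLookup) []annotatedAddr {
	out := make([]annotatedAddr, 0, len(addrs))
	for _, a := range addrs {
		out = append(out, annotatedAddr{IP: a, ASN: lookup(a)})
	}
	return out
}

// mapLookup builds an asnLookup from a static table. A real implementation
// would query the ASN database shipped with the app instead.
func mapLookup(table map[string]uint) asnLookup {
	return func(ip string) uint { return table[ip] }
}

// exampleLookup uses made-up data (documentation addresses and AS numbers).
var exampleLookup = mapLookup(map[string]uint{
	"192.0.2.10":   64496,
	"2001:db8::10": 64497,
})

func main() {
	resolved := []string{"192.0.2.10", "2001:db8::10", "198.51.100.7"}
	for _, a := range annotate(resolved, exampleLookup) {
		fmt.Printf("%s AS%d\n", a.IP, a.ASN)
	}
}
```

Because the lookup runs on the device, this step keeps working even when no backend service is reachable.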

bassosimone commented 4 years ago

This release of probe-assets https://github.com/ooni/probe-assets/releases/tag/20200529153246 includes an experimental DB that we generated. We are going to use and test it with the apps. Our tests show that this DB has reasonable accuracy. We will continue testing this solution in the coming sprints. See https://github.com/ooni/probe-engine/issues/620 for a recap of the progress.

bassosimone commented 4 years ago

So, we're now in a better position, where we clearly have a reasonable replacement for asn.mmdb. I believe we are not going to fully ditch that database, because doing so implies a stronger dependency on OONI probe services. That may not be what we want when significant parts of the network are down and we still want to perform measurements and obtain ASNs from the IP addresses included in the measurements themselves. I think we should actually look deeper into this plan as part of Sprint 21 and see whether we can create clear follow-up issues and finally close this now-lingering epic.

bassosimone commented 4 years ago

It's confirmed, we will try to complete this Epic in this season^W^W^W sprint. This should be the final episode^W^W^W issue: https://github.com/ooni/probe-engine/issues/727

bassosimone commented 4 years ago

(No need to explicitly keep this issue in Sprint 21.)

bassosimone commented 3 years ago

When Go 1.16 is out, we will change the build so that it embeds the databases. We will continue discussing alternative ways of organizing the databases, including a backend service. In any case, I think this issue has gone beyond its original scope and we can close it now. The Go 1.16 embedding strategy will mitigate many issues related to the databases anyway.