openfoodfacts / openfoodfacts-server

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor
GNU Affero General Public License v3.0

Do not use ip to initialize country for products created through the API but not directly by the client #1706

Open stephanegigandet opened 5 years ago

stephanegigandet commented 5 years ago

Products created on the web site or through the API are assigned a country by default, based on the IP address of the request. This does not work when the API is called by a server (e.g. one hosted on Amazon) rather than directly by users' devices (apps that send API calls straight from the phone).

Current code in Products.pm:

```perl
use ProductOpener::GeoIP;
my $country = ProductOpener::GeoIP::get_country_for_ip(remote_addr());

# ugly fix: products added by yuka should have country france, regardless of the server ip
if ($creator eq 'kiliweb') {
    $country = "france";
}
```

The ugly fix will soon stop working, since Yuka is expanding into other countries.

Possible solutions:

  1. Add a special field in the API to specify that API requests come directly from clients. (we could at least use it in the OFF app).
  2. Add a special field in the API to specify that requests do not come directly from clients. --> apps that proxy API requests may simply not set it
  3. Maintain a list of users for which we don't try to assign a country automatically (they can still send one using the add_country field).
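
To make option 3 concrete, here is a minimal sketch, not the actual Products.pm code. The `%skip_geoip_for_user` list, the `default_country_for_new_product` function and the stubbed lookup are all hypothetical names for illustration. The lookup is passed in as a code ref standing in for `ProductOpener::GeoIP::get_country_for_ip`, so the example runs without a GeoIP database.

```perl
use strict;
use warnings;

# Hypothetical list of accounts whose requests come from servers, so the
# request IP says nothing about where the product is sold.
my %skip_geoip_for_user = map { $_ => 1 } qw(kiliweb);

# Returns the default country for a new product, or undef when no country
# should be assigned automatically. $lookup is a code ref standing in for
# ProductOpener::GeoIP::get_country_for_ip (assumption for this sketch).
sub default_country_for_new_product {
    my ($creator, $ip, $lookup) = @_;
    return if $skip_geoip_for_user{$creator};
    return $lookup->($ip);
}

# Stub lookup for the example only; the IPs are documentation addresses.
my %stub_geoip = ('192.0.2.10' => 'france', '198.51.100.7' => 'belgium');
my $lookup = sub { return $stub_geoip{ $_[0] } };

for my $case (['kiliweb', '192.0.2.10'], ['some_user', '198.51.100.7']) {
    my ($creator, $ip) = @$case;
    my $country = default_country_for_new_product($creator, $ip, $lookup);
    printf "%s -> %s\n", $creator, $country // '(none)';
}
```

Listed users would still get a country if they send one explicitly through the add_country field; the sketch only covers the automatic default.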

Thoughts?

hangy commented 5 years ago

We could check whether requests from kiliweb and other external APIs originate from IPs that GeoLite2 would recognize as [is_hosting_provider()](https://metacpan.org/pod/GeoIP2::Model::AnonymousIP#$anon-%3Eis_hosting_provider()). (I'm not sure whether that data is in the free GeoLite2 DB, though!)

Alternatively, we could limit IP detection to specific User-Agent values. If the User-Agent is recognized as a normal browser (i.e. ^Mozilla [45]\.0.), then perform geolocation; otherwise don't. That would include most browsers and in-app browsers and exclude a bunch of libraries. It's not perfect, but geolocation is inaccurate anyway …
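
A rough sketch of the User-Agent idea, just to show its shape. The function names are hypothetical. The pattern here assumes the usual `Mozilla/5.0` form with a slash, and the lookup is again a stubbed code ref rather than the real GeoIP call.

```perl
use strict;
use warnings;

# Assumption: browser-like clients send a User-Agent starting with
# "Mozilla/4.0" or "Mozilla/5.0" (with a slash, as real browsers do).
sub looks_like_browser {
    my ($user_agent) = @_;
    return 0 unless defined $user_agent;
    return $user_agent =~ m{^Mozilla/[45]\.0} ? 1 : 0;
}

# Only geolocate when the request looks like it comes from a browser.
# $lookup is a code ref standing in for the real GeoIP lookup.
sub country_if_browser {
    my ($user_agent, $ip, $lookup) = @_;
    return unless looks_like_browser($user_agent);
    return $lookup->($ip);
}

my $lookup = sub { return 'france' };    # stub for the example only

for my $ua (
    'Mozilla/5.0 (Linux; Android 10) AppleWebKit/537.36',
    'python-requests/2.22.0',
    'okhttp/3.12.1',
) {
    my $country = country_if_browser($ua, '192.0.2.10', $lookup);
    printf "%-52s %s\n", $ua, $country // '(no geolocation)';
}
```

Note that a native app calling the API through an HTTP library such as okhttp would also be excluded unless it sets a browser-like User-Agent, so this filters by client type rather than by whether the request really comes from a phone.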

CharlesNepote commented 5 years ago

There are certainly cases where users reach the app through a proxy. For example, I might use the mobile app over a public or personal Wi-Fi network that goes through a VPN. Thus, solution 1 doesn't guarantee that the product was scanned in my country.

Why not add a new field "country_unique_ip" that counts how many unique IP addresses request a product in each country? An example value: "be:148,fr:2,nl:5" => this product may have been created from an IP in the Netherlands (where there are many VPN companies) but is clearly related to Belgium. When the number of unique IPs for a country exceeds 5 or 10, the "countries_en" field is filled with this country.
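
The counting could look roughly like this. It is a minimal in-memory sketch with hypothetical function names and a placeholder product code. A real version would need persistent storage and probably hashed IPs, and the threshold is just a parameter here.

```perl
use strict;
use warnings;

# Hypothetical in-memory store: product code -> country code -> set of IPs.
my %seen_ips;

sub record_request {
    my ($code, $cc, $ip) = @_;
    $seen_ips{$code}{$cc}{$ip} = 1;    # hash keys make repeat IPs count once
    return;
}

# Serialize as "be:148,fr:2,nl:5" (country codes sorted alphabetically).
sub country_unique_ip {
    my ($code) = @_;
    my $by_country = $seen_ips{$code} || {};
    return join ',',
        map { sprintf('%s:%d', $_, scalar(keys %{ $by_country->{$_} })) }
        sort keys %$by_country;
}

# Countries with more than $threshold unique IPs for this product.
sub countries_over_threshold {
    my ($code, $threshold) = @_;
    my $by_country = $seen_ips{$code} || {};
    return grep { scalar(keys %{ $by_country->{$_} }) > $threshold }
        sort keys %$by_country;
}

my $code = '1234567890123';    # placeholder product code
record_request($code, 'nl', '203.0.113.1');              # creator behind a VPN
record_request($code, 'be', "198.51.100.$_") for 1 .. 6;
record_request($code, 'be', '198.51.100.1');             # repeat IP, counted once

print country_unique_ip($code), "\n";                           # be:6,nl:1
print join(',', countries_over_threshold($code, 5)), "\n";      # be
```

With a threshold of 5, the single Dutch VPN address that created the product never reaches "countries_en", while Belgium does once a sixth distinct IP has requested it.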

syl10100 commented 5 years ago

Yuka (and other apps) could pass the country information when they call the OFF API. They should know which country their own app is being called from, shouldn't they?