Open bretg opened 10 months ago
@bretg in an effort to keep things as simple as possible, I'm wondering if we need the host config geolocation.enabled. If a host company does not want to enable geolocation they should be able to do so by omitting settings.geo-lookup from their account configs and just leaning on the account default settings.geo-lookup, which defaults to false.
I guess this could be of value though if you want the ability to globally toggle it, perhaps for testing purposes? Is there another scenario I'm missing where you envision this being of value?
From an optimization perspective, this host config isn't needed in PBS-Go since our discussion earlier today at the PMC led to a requirements change where the lookup happens before the raw auction stage instead of before the entrypoint stage. With this requirements change, PBS-Go should be able to take advantage of the existing account fetching logic instead of having to perform geo lookup specific parsing to extract the account ID and fetch the account object.
> wondering if we need the host config geolocation.enabled.
This is an existing config. The use case for keeping it would be as a master kill switch for geo lookup in case there's an issue with the geo lookup servers.
I'm willing to consider removing it in PBS 3.0, but it would be a breaking change to remove it now. I think it would be fine to not implement in PBS-Go if it doesn't exist there now.
> With this requirements change, PBS-Go should be able to take advantage of the existing account fetching logic
There are scenarios where the account ID isn't available until after reading the stored requests.
I just wanted to throw another idea in here. It's fair to assume that most hosting companies use a cloud provider or at least some sort of load balancing. Especially if you are operating globally, you probably load balance on the geolocation of the user. Here's a short list of commonly used technologies for load balancing around geo
In the end it boils down to two sorts of load balancing
Both variants can be used to make the geo information available directly in the request, without the need for geo location lookup.
Example: https://cloud.google.com/load-balancing/docs/https/custom-headers?hl=en
If a host company uses an application load balancer, it can add HTTP headers that contain the geo location. It's just a matter of reading those from the HTTP request.
From my minimal knowledge, application load balancers are more expensive, hence network load balancers are preferred if possible.
If the request is re-routed to another IP address, there's no possibility to append information. However, it would be possible to statically tell each running instance which geo it is running in. This would at least provide the continent.
We could extend the geo location lookup to a multi-step process, where every step can be enabled or disabled. For example:
geolocation:
  enabled: true
  # define the order in which the geo location should be determined
  lookup:
    - cloudfront-header # checks if there's a http header from cloudfront
    - maxmind # check a geo database if available
    - static # use a statically provided value
  # configure all the rest
This is a super rough sketch, just to convey the idea.
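To illustrate the ordered lookup in the config sketch above, here is a minimal Python sketch of a resolver chain in which the first step that yields a value wins. Everything in it is an assumption made for illustration: the function names are hypothetical, the header step assumes CloudFront is configured to forward its CloudFront-Viewer-Country header, and a plain dictionary stands in for a real geo database such as MaxMind. It is not existing Prebid Server code.

```python
from typing import Callable, Dict, List, Optional

# A resolver receives the request headers and client IP and returns a geo
# value (for example a country code or a continent), or None if it has none.
Resolver = Callable[[Dict[str, str], str], Optional[str]]


def cloudfront_header(headers: Dict[str, str], ip: str) -> Optional[str]:
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "cloudfront-viewer-country" and value:
            return value.upper()
    return None


def make_db_resolver(db: Dict[str, str]) -> Resolver:
    # Hypothetical stand-in for a geo database: a plain IP -> geo mapping.
    def resolve(headers: Dict[str, str], ip: str) -> Optional[str]:
        return db.get(ip)
    return resolve


def make_static_resolver(value: Optional[str]) -> Resolver:
    # Returns whatever was statically configured for this running instance.
    def resolve(headers: Dict[str, str], ip: str) -> Optional[str]:
        return value
    return resolve


def lookup_geo(steps: List[Resolver], headers: Dict[str, str], ip: str) -> Optional[str]:
    # Try each configured step in order and stop at the first answer.
    for step in steps:
        result = step(headers, ip)
        if result is not None:
            return result
    return None
```

With steps ordered as header, database, static, a request carrying the header never touches the database, which is where the cost saving would come from.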
Interesting thought @muuki88, but I don't think this is going to be possible with DNS-based geo-balancers like Akamai's GTM... there's no 'edge' in that case for 'cloudlets' to work in or attach headers to.
Here's a counter-proposal:
I want to propose adding sampling in addition to the feature toggle, so that only a configurable percentage of requests for any given account has the geo lookup happen early. The idea is to give finer control when the host company has one dominant account that generates most of the traffic, although this is more of an operational feature.
Done with PBS-Java 2.13
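A rough Python sketch of how such per-account sampling might work. The function name, the shape of the percentage map, and the default of 100% for accounts without an entry are all assumptions for illustration; nothing here reflects how PBS-Java or PBS-Go actually implement it.

```python
import random
from typing import Callable, Dict


def should_lookup_early(
    account_id: str,
    sample_pct: Dict[str, float],
    default_pct: float = 100.0,
    rand: Callable[[], float] = random.random,
) -> bool:
    # Percentage of this account's requests that get the early geo lookup.
    pct = sample_pct.get(account_id, default_pct)
    # Clamp misconfigured values into the 0..100 range.
    pct = max(0.0, min(100.0, pct))
    # rand() is in [0, 1), so 0 never samples and 100 always samples.
    return rand() * 100.0 < pct
```

Passing the random source as a parameter keeps the decision testable; a host could set a low percentage for one dominant account while leaving smaller accounts at the default.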
There are several use cases for having Prebid Server do geographic lookups:
Currently only PBS-Java does geo-lookups, and only for GDPR scope as described here. The lookup is only called when no other signals indicate GDPR scope and when the account wants PBS to enforce GDPR.
The problem is that geo-lookups have a cost in both latency and money, so there should be controls for the host company to manage the volume.
The proposal is that there should be account-level config that will cause it to do geo-lookups early in the workflow in support of the above use cases.
If host config geolocation.enabled is false, don't do lookup. Default is false.
If account config settings.geo-lookup is true (defaults to false) and if request $.device.geo.country is not specified, then PBS should do the lookup and set device.geo.country to an ISO-3166-1-alpha-3 code and device.geo.region to ISO-3166-2; 2-letter state code if USA.
settings.geo-lookup does not change or disable the ability for PBS to determine GDPR scope per the flowchart. The GDPR lookup feature is disabled if the overall geolocation.enabled is false.
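The early-lookup rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the boolean parameters standing in for the host and account configs, and the lookup callback that returns a (country, region) pair are hypothetical. The device.ip, device.ipv6 and device.geo fields follow OpenRTB. The GDPR-scope lookup is a separate path and is not modelled here.

```python
from typing import Callable, Optional, Tuple

# Hypothetical lookup service: IP -> (alpha-3 country, optional region) or None.
GeoLookup = Callable[[str], Optional[Tuple[str, Optional[str]]]]


def apply_early_geo_lookup(
    request: dict,
    host_enabled: bool,        # stands in for host config geolocation.enabled
    account_geo_lookup: bool,  # stands in for account config settings.geo-lookup
    lookup: GeoLookup,
) -> dict:
    # Both the host-level switch and the account-level flag must be on.
    if not host_enabled or not account_geo_lookup:
        return request
    device = request.get("device") or {}
    geo = device.get("geo") or {}
    # A country already present on the request is never overwritten.
    if geo.get("country"):
        return request
    ip = device.get("ip") or device.get("ipv6")
    if not ip:
        return request
    result = lookup(ip)
    if result is None:
        return request
    country, region = result
    geo["country"] = country
    if region:
        geo["region"] = region
    device["geo"] = geo
    request["device"] = device
    return request
```

Checking the two flags before anything else means a disabled host or account never pays for a lookup call.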