Detecting fraudulent engagement

miketaylr commented 1 year ago

Originally filed at https://github.com/spanicker/ip-blindness/issues/15 by @spanicker.

Services that are embedded in a third-party context will now see distinct IPs for each top-level domain that the user is visiting. This negatively impacts the ability to count the number of distinct users across a set of sites, and makes it easier to inflate impressions and ad clicks by having these same users engage on multiple sites.
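
To make the counting problem concrete, here is a minimal sketch. The log records, site names, and IP values are all hypothetical, and it assumes the embedded service uses the IP address alone as its distinct-user key.

```python
# Hypothetical log records: (user, top-level site, IP seen by the embedded service).
# The "user" field is ground truth for the illustration; the service never sees it.

def distinct_ips(records):
    """Count distinct IPs, the only signal the service is assumed to have."""
    return len({ip for _user, _site, ip in records})

# Direct connections: the service sees the same IP for a user on every site.
direct = [
    ("alice", "news.example", "198.51.100.7"),
    ("alice", "shop.example", "198.51.100.7"),
    ("bob", "news.example", "203.0.113.9"),
]

# Per-site proxied IPs: the same user shows up under a different IP on each site.
proxied = [
    ("alice", "news.example", "192.0.2.10"),
    ("alice", "shop.example", "192.0.2.44"),
    ("bob", "news.example", "192.0.2.81"),
]

print(distinct_ips(direct))   # 2, matches the two real users
print(distinct_ips(proxied))  # 3, one real user is counted twice
```

The same two users look like three once each top-level site yields its own IP, which is the inflation described above.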

Some attributes, such as GeoIP, may allow sites to validate observed regional distributions against what is expected. We are keen to discuss any suggestions that could improve defensibility within our privacy objective of preventing scaled cross-site tracking.
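
One way such a regional check could look is sketched below. It is only an illustration: the function name, the region labels, and the use of total variation distance as the comparison are assumptions, not part of the proposal.

```python
from collections import Counter

def total_variation(observed_regions, expected_share):
    """Distance between observed and expected regional shares.

    observed_regions: iterable of region labels, one per impression (hypothetical input).
    expected_share: dict mapping region label to its expected fraction of traffic.
    Returns a value in [0, 1]; 0 means the two distributions match exactly.
    """
    counts = Counter(observed_regions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations to compare")
    regions = set(counts) | set(expected_share)
    return 0.5 * sum(
        abs(counts.get(r, 0) / total - expected_share.get(r, 0.0)) for r in regions
    )

expected = {"US": 0.5, "DE": 0.5}
as_expected = ["US"] * 50 + ["DE"] * 50
skewed = ["US"] * 100

print(total_variation(as_expected, expected))  # 0.0
print(total_variation(skewed, expected))       # 0.5
```

A site would still have to choose its own alert threshold for this distance; none is implied here.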

Note that there's already some conversation w/ @dmdabbs and @etrouton in the old issue, but let's continue discussion here.

eyeoneder commented 1 year ago

Hi Mike,

I work with many clients across many sectors that run hundreds of Google Adwords campaigns. Between them they currently spend hundreds of thousands of dollars per month.

We have spent a great deal of time working on strategies to reduce their unnecessary spend by using the IP address exclusion feature within Adwords. By monitoring individual IP address activity for each campaign and those IP addresses clicking on keywords 10+ times per day, we have been able to exclude these IP addresses. The list of excluded IP addresses is updated every week.
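
A rough sketch of the rule described above, for readers who have not used the feature. The log format and the function name are hypothetical; only the "10+ clicks per day" threshold comes from the description.

```python
from collections import Counter

def ips_to_exclude(clicks, threshold=10):
    """Return (campaign, ip) pairs that reached the daily click threshold.

    clicks: iterable of (campaign, ip, day) tuples, one per click (hypothetical format).
    threshold: minimum clicks from one IP on one campaign in one day.
    """
    per_day = Counter((campaign, ip, day) for campaign, ip, day in clicks)
    return {(campaign, ip) for (campaign, ip, _day), n in per_day.items() if n >= threshold}

clicks = (
    [("campaign-1", "198.51.100.7", "2024-01-01")] * 10
    + [("campaign-1", "203.0.113.9", "2024-01-01")] * 9
)

print(ips_to_exclude(clicks))  # {('campaign-1', '198.51.100.7')}
```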

The result is that we have maintained the performance of each campaign while significantly reducing our clients' monthly spend. Without access to granular IP addresses, we will no longer be able to help our clients save money. This will reduce campaign performance and will no doubt lead to us turning off a number of campaigns, as the economics will no longer work. It also feels like Google will be making more money from our clients by removing access to IP addresses. We are aware of the Adwords fraudulent click solution, but it doesn't have the same impact as the IP address exclusion feature.

joeytPX commented 11 months ago

Hi Mike and team, I'm from HUMAN Security, working on the detection team for our media products.

Some of this has already been publicly documented or otherwise discussed in industry forums like TPAC/W3C working groups, but I'd like to highlight a driving example of the critical role IP address plays in detection of serious fraud: in this case the identification of the 3ve botnet and associated ad fraud scheme, the largest multilateral takedown of an ad-fraud oriented botnet scheme to date.

The attack worked by compromising hundreds of thousands of victim computers worldwide with a variant of the Kovter botnet. Each infected device ran a custom-built headless version of Google Chrome alongside associated bot software to ‘drive’ the browser. The browsers would then load thousands of ad-supported web pages from a broad variety of domains across the web and - either directly or indirectly - generate illegitimate ad revenue for the website owners (who were often part of the scheme). This activity generated at least $15M USD of fraudulent profits for the scheme's operators, with losses likely multiple times this amount to organisations worldwide that buy digital advertising.

IP address use was critical for the organisations involved in defending against the scheme in multiple ways:

Identifying devices on the internet that were currently infected with this malware and defrauding advertisers.
Deriving more granular real-time rulesets to prevent further fraudulent ads being shown, limiting the financial damage caused by the scheme.
Serving as shareable indicators of compromise (IOCs) for targeted information sharing between participants in the cross-industry working group as well as law enforcement. This greatly accelerated the discovery process, which ultimately allowed the scheme to be disrupted sooner.
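
As a generic illustration of how shared IP indicators can feed a real-time rule, here is a minimal sketch. It is not a description of the tooling used in the 3ve investigation; the function names and sample addresses are hypothetical.

```python
import ipaddress

def build_blocklist(iocs):
    """Parse shared indicators (single IPs or CIDR ranges) into network objects."""
    return [ipaddress.ip_network(entry, strict=False) for entry in iocs]

def is_flagged(ip, blocklist):
    """True if the request IP falls inside any shared indicator."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocklist)

# Hypothetical indicators received from a working-group partner.
blocklist = build_blocklist(["198.51.100.7", "203.0.113.0/24"])

print(is_flagged("203.0.113.55", blocklist))  # True
print(is_flagged("192.0.2.1", blocklist))     # False
```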

I also believe IP addresses were pertinent to legal details of the 2021 trial and conviction in New York City of the orchestrator of the scheme, Alexander Zhukov.

sollyucko commented 10 months ago

@eyeoneder

"By monitoring individual IP address activity for each campaign and those IP addresses clicking on keywords 10+ times per day, we have been able to exclude these IP addresses."

Echoing a comment by tiggity at https://forums.theregister.com/forum/all/2023/11/11/google_proxy_plan_cma/#c_4758481:

In many a large company, individual IP address visible internally, go "external", and just the one IP address (or maybe a handful depending on company size) "exposed" for all employees. So that marketeer with their 10 click logic would thus have a good chance of excluding the proxy IP address of lots of big companies (in cases where big companies allow a bit of personal browsing e.g. on a lunchbreak or for information research purposes).

Also note that this may also be the case for residential IP addresses due to Carrier Grade Network Address Translation (CGNAT).
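
A small simulation of that failure mode, assuming the "10+ clicks per day" rule quoted above. The user labels and IPs are hypothetical, and the user labels exist only for the illustration since the advertiser sees the IP alone.

```python
from collections import Counter

THRESHOLD = 10  # the "10+ clicks per day" rule quoted above

# Hypothetical day of clicks: (real user, IP the advertiser sees).
# Forty employees click once each, all behind one shared NAT address.
clicks = [(f"employee-{i}", "203.0.113.1") for i in range(40)]
# One home user clicks three times from their own address.
clicks += [("home-user", "198.51.100.7")] * 3

clicks_per_ip = Counter(ip for _user, ip in clicks)
users_per_ip = {
    ip: len({user for user, seen_ip in clicks if seen_ip == ip})
    for ip in clicks_per_ip
}
flagged = {ip for ip, n in clicks_per_ip.items() if n >= THRESHOLD}

print(flagged)                      # {'203.0.113.1'}
print(users_per_ip["203.0.113.1"])  # 40
```

The shared address is excluded even though no individual behind it clicked more than once.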

humbertoby8212 commented 10 months ago

Hi @miketaylr

This article was recently posted on The Register: https://www.theregister.com/2023/11/11/google_proxy_plan_cma/

There is a comment at the end from Google that states:

"Critics claiming that our IP Protection proposal is self-preferential to Google are either knowingly misrepresenting the facts, or simply don’t understand what is being proposed. The IP Protection proposal includes a two-hop proxy system, with one proxy being operated by a third-party."

@eyeoneder has written above that once Google implements IP Protection, their solution will no longer be able to capture IP addresses and exclude the appropriate ones from their Adwords campaigns. This will drive up their Google Adwords spend.

How is it not self-preferential when Google is implementing a system that will ultimately cause people to spend more money with Google on Adwords once IP Protection is launched???

movementforanopenweb commented 10 months ago

Further to @humbertoby8212's observation, we stand by our comment, which was originally posted on X.

Chrome is a software product controlled by Google and has access to IP addresses and user destinations. This is an engineering fact. Chrome would not work otherwise.

The two-hop proxy system is not relevant. Google via their control of Chrome WILL be able to “see both the client IP address and user destination”.

Perhaps Google have organizational measures in place to prevent the IP address and user destination known to Chrome becoming known to other parts of Google. However, how would users or competitors know or confirm this? Why would such a solution be acceptable when applied by Google, but apparently not when used by others?!

We do know from trial documents that Chrome in Incognito mode, where users might assume that Google does not use their IP address, does in fact make use of it (see snippet picture).

We would prefer Google contribute to internet features to verify all uses of IP address, including by themselves, such that users and competitors could have some confidence the promises they make are being respected. This would be a genuine privacy improvement.

Alternatively, Google could cede control of Chrome by putting the entire Chrome product in a completely separate organization and committing not to operate a web browser or operating system. See our remedies summary.

GoogleChrome / ip-protection

Detecting fraudulent engagement #4