Improve PA API debugging capabilities

fhoering commented 1 year ago

forDebuggingOnly endpoints are very useful for debugging client-side JavaScript code. It was recently announced that they will remain available in a heavily sampled mode, which will be very useful for monitoring.

Because the number of received events will be limited (around 5k per day), it still makes sense to have a way to debug more data. A complementary solution could be to activate those endpoints in debug mode in a local Chrome browser and randomly explore observed real interest group taggings and real auctions (without knowing the exact user journey, as this is no longer possible).

Integration tests

Here is a documented example of how integration tests can be executed with Chromedriver & Selenium: https://github.com/RTBHOUSE/chromium-fledge-tests

Those tests could be executed on:

a locally spawned webserver (one needs to replicate production server-side user data; currently difficult because only endpoints with (fake) HTTPS certificates work)
a real hosted staging/production endpoint (no need to replicate user data, HTTPS certificates would already work)

Sample workflow

A sample workflow to achieve the random user journey could be to:

Query server side logs of visited publisher pages (contextual call for perBuyerSignals) & advertiser pages (tagging call for interest group data)
Create a Chrome driver session

Replay some tagging endpoints for interest group creation

driver = webdriver.Chrome(..)
driver.get("https://adtech.com/create-ig-group")

Current restriction: IG owner will be the endpoint hosted on https://adtech.com, all endpoints like biddingLogicURL must be hosted on this domain

Call a locally hosted test publisher page with real collected perBuyerSignals (as there is currently no constraint that the seller must also own the publisher domain)

const adUrn = await navigator.runAdAuction({
    seller: 'https://publisher.com',
    decisionLogicURL: 'https://publisher.com/fledge_logic',
    interestGroupBuyers: ['https://adtech.com'],
    perBuyerSignals: my_collected_buyer_signals
});

Collect the calls to forDebuggingOnly.reportAdAuctionWin & forDebuggingOnly.reportAdAuctionLoss. In this workflow they will be sent to https://adtech.com. Because the traffic is end-to-end encrypted with HTTPS, it is hard to intercept those calls, even with a proxy.
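The replay part of this workflow can be sketched in Python as below. This is only an illustration under assumptions: `pick_journey` and `replay` are hypothetical helper names, the URL lists stand in for whatever the server-side logs return, and `driver` is any object with a Selenium-style `get(url)` method.

```python
import random
from typing import List, Sequence


def pick_journey(tagging_urls: Sequence[str], publisher_urls: Sequence[str],
                 n_tags: int, seed: int = 0) -> List[str]:
    """Pick up to n_tags random tagging URLs followed by one random publisher page."""
    rng = random.Random(seed)  # seeded so that a failing journey can be replayed
    tags = rng.sample(list(tagging_urls), k=min(n_tags, len(tagging_urls)))
    return tags + [rng.choice(list(publisher_urls))]


def replay(driver, journey: Sequence[str]) -> None:
    """Visit each URL in order; driver only needs a Selenium-style get(url) method."""
    for url in journey:
        driver.get(url)
```

With a real session, `replay(webdriver.Chrome(...), pick_journey(tagging_urls, publisher_urls, n_tags=3))` would join the interest groups first and then load the test publisher page that runs the auction.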

We would like to collect the calls to forDebuggingOnly.reportAdAuctionWin & forDebuggingOnly.reportAdAuctionLoss locally (console, file, ..) or at least on a webserver running on localhost

As of now, we would need to patch the interest group in the InterestGroups DB to expose a dedicated bidding script that calls the debug endpoint on https://localhost/ or on some hosted server https://adtech1.com with a real certificate.

it doesn’t seem to make much sense to expose a dedicated bidding script on domain https://adtech.com that registers a debug endpoint with forDebuggingOnly.reportAdAuctionWin("https://localhost/...")
it is difficult to create a local server with fake SSL certificates or to set up reverse proxies (often platform dependent, and it works differently depending on the stack: Python, C#, ..)
if we use a new domain for this, such as https://adtech1.com, we need to copy the data back to our testing environment
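For the "webserver running on localhost" option, a collector can be very small. The sketch below uses only the Python standard library and records every GET request it receives. It assumes that debug reports arrive as GET requests, and it serves plain HTTP for brevity; since the thread expects HTTPS endpoints, a real setup would still need to wrap the socket with TLS. `DebugReportCollector` and `start_collector` are hypothetical names.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit


class _ReportHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the path and query parameters before answering.
        parts = urlsplit(self.path)
        self.server.reports.append((parts.path, parse_qs(parts.query)))
        self.send_response(204)  # no body to return
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep test output quiet


class DebugReportCollector(HTTPServer):
    """HTTP server that remembers every GET request it receives."""

    def __init__(self, address=("127.0.0.1", 0)):  # port 0 = any free port
        super().__init__(address, _ReportHandler)
        self.reports = []  # list of (path, query-parameter dict) tuples


def start_collector() -> DebugReportCollector:
    """Start the collector in a background thread and return it."""
    server = DebugReportCollector()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After an auction, a test can inspect `server.reports` and call `server.shutdown()` when it is done.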

Chromium debug features that could potentially unlock or simplify this use case

Be able to register interest groups under http://localhost, see https://bugs.chromium.org/p/chromium/issues/detail?id=1208187
Be able to patch only some endpoints of the interest group without owner domain restrictions. Currently I can already patch interest groups with a SQL query against Chrome’s local DB, but if I change the biddingLogicURL to a URL on another domain, such as https://localhost/my_bidding_logic, it will not work because the domain must be the same as the IG owner domain (https://adtech.com)
Currently, by design, there is little feedback from the auction, which makes automated tests complicated. One could, for example, allow output to the Chrome console from inside the bidding script, both for easier debugging and so that tests can produce output and assert that some conditions hold
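The owner-domain restriction described in this list can be approximated with a small origin comparison, which is handy for validating a patched interest group before writing it to the local DB. This is a sketch of the rule as stated in this thread, not Chrome's actual check; `origin_of` and `is_same_origin_as_owner` are hypothetical names.

```python
from urllib.parse import urlsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> tuple:
    """Return (scheme, host, port) for a URL, filling in the scheme's default port."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or _DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def is_same_origin_as_owner(owner: str, url: str) -> bool:
    """True when url is an HTTPS URL on the interest group owner's origin."""
    candidate = origin_of(url)
    return candidate[0] == "https" and candidate == origin_of(owner)
```

Under this rule, `https://adtech.com/bidding_logic.js` passes for owner `https://adtech.com`, while `https://localhost/my_bidding_logic` does not.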

morlovich commented 1 year ago

FWIW, the flags that can be used to develop web platform tests with Chrome can be of some use here; that setup uses something like: --ignore-certificate-errors --host-resolver-rules="MAP nonexistent.*.test ^NOTFOUND, MAP *.test. 127.0.0.1, MAP *.test 127.0.0.1"

You can use this sort of resolver rule w/ the appropriate hostnames rather than *.test to direct adtech.com towards 127.0.0.1, and then --ignore-certificate-errors will make Chrome not care about the certificate.
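A minimal sketch of how these two flags could be passed from a Selenium test, assuming the Python bindings are used. `build_chrome_args` and `make_driver` are hypothetical helper names; the hostnames are whichever production domains should be redirected to the local server.

```python
from typing import Iterable, List


def build_chrome_args(hostnames: Iterable[str], target_ip: str = "127.0.0.1") -> List[str]:
    """Return Chrome flags that resolve each hostname to target_ip and skip certificate checks."""
    rules = ", ".join(f"MAP {host} {target_ip}" for host in hostnames)
    return [
        f"--host-resolver-rules={rules}",
        "--ignore-certificate-errors",
    ]


def make_driver(hostnames: Iterable[str]):
    """Start Chrome through Selenium with the flags above (needs selenium and chromedriver)."""
    from selenium import webdriver  # imported lazily so build_chrome_args works without Selenium

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(hostnames):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

No shell quoting is needed around the rule string here, because each argument is handed to Chrome as a single value.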

michaelkleber commented 1 year ago

Fabian, it sounds like Maks's comment about --ignore-certificate-errors and --host-resolver-rules will solve your domain name and TLS certificate problems, right?

Beyond that, are forDebuggingOnly.reportAdAuctionWin and forDebuggingOnly.reportAdAuctionLoss sufficient for the rest of your needs? We can certainly leave a debugging flag in place to enable those APIs to send their reports all the time, even once they are downsampled in the real world.

michaelkleber commented 1 year ago

Oh, also we do think we "allow output to the Chrome console from inside the bidding script"! Please let us know if you find that's not working.

fhoering commented 1 year ago

@morlovich Many thanks for those flags. We will definitely try those out. It would be nice if it can unlock some of our tests.

@michaelkleber

Fabian, it sounds like Maks's comment about --ignore-certificate-errors and --host-resolver-rules will solve your domain name and TLS certificate problems, right?

Yes, normally it should.

Beyond that, are forDebuggingOnly.reportAdAuctionWin and forDebuggingOnly.reportAdAuctionLoss sufficient for the rest of your needs? We can certainly leave a debugging flag in place to enable those APIs to send their reports all the time, even once they are downsampled in the real world.

Yes, it could be sufficient. It would also be nice to make an exception to the IG domain checks for localhost, so that we can patch the bidding script in the local InterestGroups DB as described and override the URLs for forDebuggingOnly.reportAdAuctionWin.

Oh, also we do think we "allow output to the Chrome console from inside the bidding script"! Please let us know if you find that's not working.

OK, indeed. I wasn't aware that console logs already work in Chrome.

Actually, it seems that they are only missing in Chromedriver, and since this ticket is about integration tests, the point is still relevant.

This has already been reported here: https://github.com/WICG/turtledove/issues/499

Furthermore, we have noticed that console.log output from seller and buyer scripts is missing from the Selenium WebDriver logs. This is a crucial feature that we previously had access to when debugging regression tests, and we would greatly appreciate it if it could be restored.

JacobGo commented 1 year ago

Chiming in from a Google Ads testing perspective: you may find it helpful to route all Chrome traffic to a proxy that can serve local or faked versions of the different servers, or leak to production if complete hermeticism isn't strictly necessary.

+1 to the ask for console.log within a worklet to be routed through Chromedriver logging, similar to console.log in normal JS contexts. This remains an issue and hampers print-style debugging flows that start with a failing test, although DevTools debugger flows are a viable and much appreciated alternative here.

Also +1 to a debugging flag to remove downsampling of forDebuggingOnly once https://github.com/WICG/turtledove/issues/632 goes into effect. We have built infrastructure to log these in the test environment and even assert on their shape to obtain more granular assertions on auction outcomes across different scenarios.
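As an illustration of asserting on the shape of collected reports, the sketch below parses debug report URLs into an outcome and a parameter dictionary. It is not the infrastructure mentioned above: the `/win` and `/loss` path suffixes, the query parameter names, and the helper names are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable
from urllib.parse import parse_qs, urlsplit


def parse_debug_report(url: str) -> Dict:
    """Split a report URL into an outcome (from the path) and its query parameters."""
    parts = urlsplit(url)
    if parts.path.endswith("/win"):
        outcome = "win"
    elif parts.path.endswith("/loss"):
        outcome = "loss"
    else:
        outcome = "unknown"
    # Keep the first value of each query parameter.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"outcome": outcome, "params": params}


def count_outcomes(urls: Iterable[str]) -> Dict[str, int]:
    """Count how many reports were received per outcome."""
    return dict(Counter(parse_debug_report(url)["outcome"] for url in urls))
```

A test could then assert, for example, that exactly one win report was received and that it carries the expected bid.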

morlovich commented 1 year ago

Hmm. So I suspect the reason console.log may have stopped working for you is that I removed our hand-rolled minimal implementation in M100 (hit stable Tue, Mar 29, 2022 --- does the timing feel vaguely right?), and now it works exactly the same as console.log does everywhere else (using V8's devtools support). Maybe Chromedriver doesn't hook that up for worklet targets, though. It does seem to be doing stuff based on it, at least: https://source.chromium.org/chromium/chromium/src/+/main:chrome/test/chromedriver/chrome/console_logger.cc;drc=7172fffc3c545134d5c88af8ab07b04fcb1d628e;l=53

Do you have to do anything special to get the console.log stuff from the top-level?

fhoering commented 11 months ago

@morlovich I tried applying --ignore-certificate-errors

It does indeed bypass the valid-certificate and CA checks. However, I still need to deploy my server with a TLS certificate and call an HTTPS domain.

WARNING test__.py:54 {'level': 'SEVERE', 'message': "http://localhost:8101/ 4:12 Uncaught DOMException: Failed to execute 'joinAdInterestGroup' on 'Naviga…ay only joinAdInterestGroup from an https origin.", 'source': 'javascript', 'timestamp': 1701776797026}

Would it be complicated on your side to allow localhost? (This issue is also tracked at https://bugs.chromium.org/p/chromium/issues/detail?id=1208187)

fhoering commented 11 months ago

Do you have to do anything special to get the console.log stuff from the top-level?

No, I start chromedriver with --enable-chrome-logs and then get them with: driver.get_log('browser')

https://selenium-python.readthedocs.io/api.html#selenium.webdriver.remote.webdriver.WebDriver.get_log
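A small sketch of post-processing those entries, assuming they have the dictionary shape shown in the log line earlier in this thread (`level`, `message`, `source`, `timestamp`). `console_messages` is a hypothetical helper, and the level ordering is an assumption about the values chromedriver reports.

```python
from typing import Dict, Iterable, List

# Capability value quoted later in this thread; asks chromedriver for all browser log levels.
LOGGING_PREFS = {"browser": "ALL"}

# Assumed ordering of the level names found in browser log entries.
_LEVELS = ["DEBUG", "INFO", "WARNING", "SEVERE"]


def console_messages(entries: Iterable[Dict], min_level: str = "INFO") -> List[str]:
    """Return the messages of log entries at or above min_level, in their original order."""
    threshold = _LEVELS.index(min_level)
    return [
        entry["message"]
        for entry in entries
        if entry.get("level") in _LEVELS and _LEVELS.index(entry["level"]) >= threshold
    ]
```

In a test this would be used as `console_messages(driver.get_log('browser'))`, with the session created after `options.set_capability('goog:loggingPrefs', LOGGING_PREFS)`.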

fhoering commented 7 months ago

@morlovich Any news on console.log output not showing up in Chromedriver?

The issue is still there, and it is not viable to set up manual test pages in Chrome for everything and debug everything with DevTools. At some point, when complexity increases, automated tests are needed.

morlovich commented 7 months ago

Thanks for the reminder. I spent some time debugging this, and I think it just doesn't include console.log because the default browser logging level is warning or higher.

So possible solutions are: 1) Use console.warn 2) Configure chromedriver to include at least info-level logging in the browser log. I don't know what the official way of doing this is, but the random stackoverflow hit of: options.set_capability('goog:loggingPrefs', {'browser': 'ALL'}) ... does appear to work.

(It's possible that for other uses it somehow gets picked up in a roundabout way despite the log level.)

fhoering commented 7 months ago

@morlovich Neither of the options you suggested works.

I was actually already running my scripts with the setting

  options.set_capability('goog:loggingPrefs', {'browser': 'ALL', 'performance': 'ALL'})

The issue is really that logs (whatever their level) from inside the worklets don't show up in Chromedriver. Logs from the normal JS context, such as what happens before and after runAdAuction, are visible.

morlovich commented 7 months ago

Oh. I think I spent a bunch of time debugging why a console-api message wasn't showing up while not realizing it was coming from the main page rather than the worklet. My apologies, I'll get back to you in a bit.

morlovich commented 7 months ago

OK, I can confirm the problem and I understand why it's happening (basically ChromeDriver isn't paying attention to auction_worklet targets at all, though it does get attached/detached to/from them); sadly not a straightforward thing to fix, but also not something fundamentally difficult --- it's just not something things are well set up to handle.

fhoering commented 7 months ago

OK. Thanks for the investigation. Do you think you can prioritize this issue on your side? Today the auction is a big black box, so I think any feedback from inside it could be useful for debugging. It also seems important to get things working consistently across Chrome and ChromeDriver.

JacobGo commented 7 months ago

@morlovich I was curious whether you were aware of another bug in DevTools > Console with PA script runners, where (unserialized) JS objects are logged normally but are frozen/unexpandable once the auction concludes and the script runners are disposed. This makes it fairly inconvenient to log and explore browserSignals in DevTools, for example.
