microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
https://playwright.dev
Apache License 2.0

[Question] The browser rendering two versions of the same page source code #5793

Closed ghost closed 3 years ago

ghost commented 3 years ago

I have an issue scraping Google Maps. I launch my browser (the issue is the same with webkit, chromium or firefox) and open the page through browser.newContext(). Seemingly at random, the source code of the page https://www.google.fr/maps/search/cars/@48.868987,2.3106734,13z comes in one of two versions. This is the page that lists all the places matching a query. As an example, I am trying to retrieve the information for every place. In one version, the result container looks like this: <div jsaction="mouseover:pane.wfvdle12;mouseout:pane.wfvdle12" jstcache="125" class="sJKr7qpXOXd__result-container sJKr7qpXOXd__wide-margin sJKr7qpXOXd__has-image" jsan="7.sJKr7qpXOXd__result-container,7.sJKr7qpXOXd__wide-margin,7.sJKr7qpXOXd__has-image,0.jsaction"><a aria-label="CAR SALE LUXURY" jsaction="pane.wfvdle12;focus:pane.wfvdle12;blur:pane.wfvdle12;auxclick:pane.wfvdle12;contextmenu:pane.wfvdle12;keydown:pane.wfvdle12;clickmod:pane.wfvdle12" jstcache="126" href="https://www.google.fr/maps/place/CAR+SALE+LUXURY/data=!4m5!3m4!1s0x47e66e24d66d0f1b:0x2dca3616a6aa8fcf!8m2!3d48.8660871!4d2.3369379?authuser=0&amp;hl=fr&amp;rclk=1" class="place-result-container-place-link" jsan="7.place-result-container-place-link,0.aria-label,8.href,0.jsaction">

In the other version, it looks like this: <div class="section-result-content"> <div jstcache="751" class="section-result-text-content" jsan="7.section-result-text-content"> <div class="section-result-header-container"> <div jstcache="752" class="section-result-header" jsan="7.section-result-header"> <div jstcache="753" class="section-result-partial-result-justification" style="display:none">

I would like to force the browser to render the same version every time.

yury-s commented 3 years ago

It looks like the server returns different results depending on where the request comes from. I don't think you can control this from the client: for example, two different clients may get pages with different experiments enabled in them.
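
Since the variant apparently cannot be pinned from the client, one possible workaround is to make the scraper accept both layouts. Below is a minimal sketch of that idea, not something from the Playwright API: the names `LAYOUT_CLASSES`, `classTokens`, `detectLayout` and `EITHER_LAYOUT_SELECTOR` are hypothetical, and the two class names are simply copied from the snippets in the question, so they are assumptions that may stop matching whenever Google changes its markup.

```typescript
// The two layouts seen in the question, keyed by one class name from each snippet.
// These class names are assumptions taken from the question and may change.
type Layout = "placeLink" | "sectionResult" | "unknown";

const LAYOUT_CLASSES: { layout: Layout; className: string }[] = [
  { layout: "placeLink", className: "place-result-container-place-link" },
  { layout: "sectionResult", className: "section-result-content" },
];

// Collect every whitespace-separated token found in class="..." attributes.
function classTokens(html: string): string[] {
  const tokens: string[] = [];
  const classAttr = /class="([^"]*)"/g;
  let match: RegExpExecArray | null;
  while ((match = classAttr.exec(html)) !== null) {
    for (const token of match[1].split(/\s+/)) {
      if (token.length > 0) tokens.push(token);
    }
  }
  return tokens;
}

// Hypothetical helper: report which known layout an HTML string contains.
function detectLayout(html: string): Layout {
  const tokens = classTokens(html);
  for (const entry of LAYOUT_CLASSES) {
    if (tokens.indexOf(entry.className) !== -1) return entry.layout;
  }
  return "unknown";
}

// A CSS selector list that matches an element from either layout.
const EITHER_LAYOUT_SELECTOR: string = LAYOUT_CLASSES
  .map((entry) => "." + entry.className)
  .join(", ");
```

In a Playwright script, the combined selector could be passed to `page.waitForSelector(EITHER_LAYOUT_SELECTOR)` so the wait succeeds whichever version is served, and the string returned by `page.content()` could then be given to `detectLayout` to pick the matching extraction path.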