microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
https://playwright.dev
Apache License 2.0
64.91k stars 3.53k forks

[BUG] code using CSS pseudo elements not showing (particularly ::before and ::after) #14390

Closed LikeMoneySecretly closed 2 years ago

LikeMoneySecretly commented 2 years ago

Context:

Python Code Snippet:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Launch a headed Microsoft Edge; slow_mo delays each operation by 10 ms.
    browser = p.chromium.launch(headless=False, slow_mo=10, channel="msedge")
    page = browser.new_page()
    page.goto("https://www.bet365.com/#/HO/")
    print("This is the page title:" + page.title())
    # Keep the window open for 100 seconds so the page can be inspected.
    page.wait_for_timeout(100000)
    browser.close()

Gyazo links showing the problem in the website's code:
With Playwright opening Edge: https://gyazo.com/25c80802004a43048dc9085144afa6ae
Opened manually in Edge: https://gyazo.com/05df109591e8ec8e29b6cc8fa530a608

Basically, the issue is that content that relies on ::after is not showing up in the Playwright version of the website.

This is repeatable in Chromium as well; I just did it in Edge for ease. My initial suspicion was that some of the anti-botting software we use might be doing this. However, that doesn't make sense, as our typical response to botting is much stronger than simply making small parts of the website not show up. Therefore my impression was that it might actually be an issue with how Playwright interprets the website, in particular that CSS-dependent code is not getting interpreted as it should.

yury-s commented 2 years ago

Please provide a reduced scenario which we can reproduce locally, or at least the exact description of what element (selector?) you expect to see and which doesn't appear when the browser is controlled by Playwright. Otherwise this is a large live page and it's unclear what to look for.

LikeMoneySecretly commented 2 years ago

For example: https://gyazo.com/a21c7f0049794c858c15f26ec033113b

LikeMoneySecretly commented 2 years ago

I suppose the best way to describe it is that there are missing buttons and containers with variables that regularly change (see above for an example of one of those classes).

Is it possible that Playwright is only taking a snapshot of a webpage and that it's therefore failing to grab any information that changes relatively quickly after that initial snapshot?

The reason I think that is that when opened in msedge with Playwright it shows a blank area where those buttons and containers normally are.

EDIT: i.e. regularly (every 5 or so seconds, typically less) all of these containers and buttons are updated. They aren't loaded to begin with on the normal version of the website but are added later and then updated constantly. However, they are not appearing in the Playwright version at all, and aren't even getting updated in.

LikeMoneySecretly commented 2 years ago

EDIT: The previous response written in this comment was incorrect; this photo https://gyazo.com/5742612407337c7808736d11e2665dee is actually far more accurate. On the left is the manually opened Microsoft Edge and on the right is the Playwright Microsoft Edge. All the orange class names don't change between the two versions; however, the red class names are different (i.e. the Playwright one is different to the manual one). The green class is the wrapper to the red classes.

As I said above, it looks like Playwright is not keeping up with the JavaScript in the webpage: either the web page isn't giving that JavaScript to Playwright or Playwright isn't displaying it. Is there any way to tell which is which?

You can also see the excess CSS pseudo elements on the right-hand side (Playwright version), a blatant example being the excess ::before at the start of the Playwright version.

pavelfeldman commented 2 years ago

There is no Playwright Edge; the same Microsoft Edge is running in both cases. We can't tell much by the DOM. Can you observe the visual difference between the actual pages? It could be a different page, a different point in time, etc.

LikeMoneySecretly commented 2 years ago

Nah, there is a clear visual difference, and it's an issue that seems to happen in Puppeteer as well: when I ran the same test in Puppeteer the identical issue happened. And no, I copied the identical link between the two pages; the Playwright side is visually different and is clearly missing containers and buttons in the box area where they should be.

So to summarise: same link, and if both are refreshed from the same link at roughly the same time a difference still appears (missing containers).

In regards to a point-in-time difference, is there a way to record the internal Playwright time, or to find out whether time works differently between the two browsers?

The time difference is really the only thing I can think of that would carry over between Puppeteer and Playwright.

LikeMoneySecretly commented 2 years ago

I don't suppose using page.wait_for_timeout(10000) is causing this issue? I use time.sleep(140) in the Puppeteer version as well, just to keep the non-headless browser open long enough to check all the visual code through inspect element.

Then again, when refreshing the page in the middle of the timeout the content still doesn't appear, unless of course the page is instantly frozen again after the refresh and the Playwright browser has no time to update.

If that is the issue, it suggests that Playwright isn't updating the browser while it is waiting, so dynamically changing websites aren't being picked up. Is there a way to wait dynamically in Playwright so that the browser isn't in an effectively frozen state?

LikeMoneySecretly commented 2 years ago

I tried using a system which waited for the specified selectors to appear in the DOM before the browser quit; however, that did not work either. I think it might be an issue with the way Playwright is handling waiting (something that is carried over from Puppeteer, maybe).

LikeMoneySecretly commented 2 years ago

I managed to fix it by using Puppeteer and connecting to an existing browser tab. Does that help somewhat in figuring out what's going on here?

pavelfeldman commented 2 years ago

Is https://www.bet365.com/#/HO/ the page?

It looks like your page has responsive design, so it visually depends on the window size. Try opening Chrome DevTools, selecting responsive mode and making it 1280 x 720. When I do that, pages are the same in Playwright and in the browser.
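The 1280 x 720 size is Playwright's default viewport, which is applied to new pages unless it is overridden. A minimal sketch of the two ways to control it from the Python API follows; the helper names `page_options` and `open_page` are made up for illustration, and the URL is whatever page you are debugging. Passing `no_viewport=True` disables the fixed viewport so that a headed page follows the real window size.

```python
# Playwright's documented default viewport size.
DEFAULT_VIEWPORT = {"width": 1280, "height": 720}


def page_options(match_window: bool) -> dict:
    """Build keyword arguments for browser.new_page() (hypothetical helper).

    match_window=True  -> no fixed viewport, the page follows the window size.
    match_window=False -> explicit 1280 x 720 viewport (same as the default).
    """
    if match_window:
        return {"no_viewport": True}
    return {"viewport": dict(DEFAULT_VIEWPORT)}


def open_page(url: str, match_window: bool = True, keep_open_ms: int = 100000) -> None:
    """Open url in headed Edge with the chosen viewport behaviour."""
    # Imported here so the helper above can be used without a browser install.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False, channel="msedge")
        page = browser.new_page(**page_options(match_window))
        page.goto(url)
        # Keep the window open long enough to inspect the layout by hand.
        page.wait_for_timeout(keep_open_ms)
        browser.close()
```

Calling `open_page("https://www.bet365.com/#/HO/")` should then lay the page out against the actual window rather than a fixed 1280 x 720 area, which makes it easier to compare with a manually opened browser.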

LikeMoneySecretly commented 2 years ago

Yeah, that works. I noticed that Puppeteer seems to auto-resize when connecting to browser instances, so I can see this being the issue. If you know Puppeteer: I had to enable msedge.exe to boot in dev mode using http://127.0.0.1:9222, then connect to it and turn off the default viewport, as it was completely messing up all browser tabs when connecting to the instance.
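For comparison, a rough Playwright (Python) sketch of the same attach-to-a-running-browser approach. It assumes Edge was started with the Chromium `--remote-debugging-port=9222` flag, which is my guess at what "dev mode" means here; `cdp_endpoint` and `list_open_tabs` are hypothetical helper names.

```python
def cdp_endpoint(host: str = "127.0.0.1", port: int = 9222) -> str:
    """Build the DevTools endpoint URL of an already running browser."""
    return f"http://{host}:{port}"


def list_open_tabs(endpoint: str) -> list:
    """Attach to the running browser and report each tab's URL and viewport."""
    # Imported here so cdp_endpoint() works without a browser install.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Connect to the existing browser instead of launching a new one.
        browser = p.chromium.connect_over_cdp(endpoint)
        # The tabs that were already open live in the browser's default context.
        context = browser.contexts[0]
        return [(page.url, page.viewport_size) for page in context.pages]
```

With the browser running, `list_open_tabs(cdp_endpoint())` returns one `(url, viewport_size)` pair per open tab, which is a quick way to check whether a fixed viewport is in effect on the attached pages.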

LikeMoneySecretly commented 2 years ago

As an aside, now that this issue has been closed: do you think this will work in headless mode? Headless mode, at least from a superficial glance, doesn't seem to require a resolution framing.