microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
https://playwright.dev
Apache License 2.0
67.35k stars 3.71k forks source link

[BUG] Codegen changed locator strategy ignores ID's #19439

Open gselsidi opened 1 year ago

gselsidi commented 1 year ago

So the new version of playwright change the codegen locator strategy to webFirst such as .byRole, byName, One it completely ignores ID.... which the old selector method was capable of capturing.

byRole > byName is ultra useless for an element that changes text content, where we have an actual ID set to that element. How can we get the old behavior back of the codegen?

Is there an option to choose a recorder strategy?

yury-s commented 1 year ago

@gselsidi can you share a concrete example where it breaks (to help us better evaluate the problem) ?

gselsidi commented 1 year ago

What do you mean? Older versions of playwright the recorder would grab locators like.

Locator( div >> div >> something)

now the recorder grabs.

.getbytext .getbyrole( name: “foo”)

it changed the method at how it’s saving elements. Before it was strictly locator() now it’s these web first methods.

gselsidi commented 1 year ago

I don’t know if this is a bug, it seems like I said the strategy of dealing with elements was changed.

These new methods don’t have a getbyid method.

So how can we change back to the old strategy for the recorder?

Attic0n commented 1 year ago

It would be nice to have a way to prioritize the locator strategy of the codegen, and exclude some. Text or id are pretty useless when they are not unique, or dynamic.

gselsidi commented 1 year ago

@yury-s easiest way to reproduce would be click on a button that has an id. You will see the locator strategy will record it as.

byRole(button, name: text) instead of byRole(button, id: id)

Also if you want send me your email I can screen-share with you to repro, as I can't upload our whole code base here.

yury-s commented 1 year ago

It is our deliberate choice to recommend the new locator APIs as they serve better most of the users and help to avoid many bad practices (you can watch our release video to learn more about why we prefer ones over the others). Element ID is not considered to be a good locator as in many cases it's just a generated string which will also change over time. The generator does prioritize data-testid attribute above most others though, if the element has such an attribute the generator will produce getByTestId locator. The difference compared to id is that data-testid is supposed to be stable and maintained by the developers as the application evolves.

The generated code is the way we (Playwright team) recommend writing your locators but it is not the only wayand you can always edit the generated code to your liking, replacing getByRole etc with locator('id=..').

I will keep this bug open to see if there is more interest in such strategies, but we don't have any immediate plans to change this behavior.

gselsidi commented 1 year ago

@yury-s that's a very faulty assumption both test-dataid and id are equal if they are dynamic or hard coded depends on the framework and how they are set up. Usually developers maintain id not test-dataid. I've seen react have both dynamic id's and test-dataids also.

Both should have the same priority id, test data id, then role label ect.

Our Id is hard coded, whereas get by role, name: is dynamic and basically a useless locator, whereas regular id would of been a proper one.

  1. id should have the same priority as data-test id, def more priority than any of the getBy methods.
  2. allow users to choose a strategy on how the code gen grabs code, whether with these new formats or the old CSS format
gselsidi commented 1 year ago

If you're using Page Objects, here is the simplest solution we've used:

  1. Import 'selectors' from playwright/test
  2. add this setting: selectors.setTestIdAttribute('id')
  3. use page.getByTestId('your id here')

does this apply to the recorder? also any way to set it globally?

mprykhodko commented 1 year ago

If you're using Page Objects, here is the simplest solution we've used:

  1. Import 'selectors' from playwright/test

  2. add this setting: selectors.setTestIdAttribute('id')

  3. use page.getByTestId('your id here')

does this apply to the recorder? also any way to set it globally?

I actually spent more time today using this setTestIdAttribute and realized it does not actually change from default 'data-testid' to 'id' I tried to apply it on 'test' module level by adding config property and no luck as well. You're right, it seems to be a global config property but it won't work for us so we reverted to legacy way ...locator(id=myId... For global, the believe this is the setting:

// Set custom test id attribute from @playwright/test config: use: { testIdAttribute: 'data-pw' }

gselsidi commented 1 year ago

@mprykhodko that works thanks, but we have actual data-testids in some parts but it's good to know. @yury-s the simplest solution would be to add getById() as another method and prioritize them both.

ZooDoo4U commented 1 year ago

Would really like to see getById()... An element's ID and date-testId are and should be two different things, each with different semantics. If data-testId is the same then would assert remove it as we have two pieces of data doing the same exact thing. Ah, but they are different. First is the HTML spec ID's must be unique, data-testId not... And i've worked on contract as a tester where we did test for the id. But data-testId don't require uniqueness, so they absolutely can't be the same. Not sure when did the data-testId's go through a formal W3C standards/peer review... Given we are automating web pages using and conforming to the HTML might be the best thing, it has and supports element ID's. For convenience here is a link to the HTML spec... https://www.w3.org/TR/2011/WD-html5-20110525/elements.html#the-id-attribute

Honestly was surprised when PW introduced the changes for the getByXXX(). Awesome, but where in the hell is getById(). Some may not use it, find. But many do and prefer it. Would argue those arguing against not using ID is at best a subjective position, so if one doesn't want to use them great don't those who would prefer based on their needs great us it... IIRC, locators were made 'strict' around 1.13 which would be right in line with how getById() should work... Honestly there is no rational reason why there is no getById()?

Furthermore, the recorder && codegen... Sure wish it makes up it's mind... At first when raising issue on it's functionality, was told "It is only suppose to be a tool to learn how to use locators..." I had opened an issue prior asking uses be able to influence how the recorder selects elements. Might assert that is now needed all the more? Really no reason not to make codegen a but customizable if possible. Would assert there should be an order of precedence users should be allowed If there are multiple i.e.

Would assert codegen honor what the user's preference is...? Also a discussion from the PW Discord got me to post here... (https://discord.com/channels/807756831384403968/1055818559210655796)
gselsidi commented 1 year ago

Would really like to see getById()... An element's ID and date-testId are and should be two different things, each with different semantics. If data-testId is the same then would assert remove it as we have two pieces of data doing the same exact thing. Ah, but they are different. First is the HTML spec ID's must be unique, data-testId not... And i've worked on contract as a tester where we did test for the id. But data-testId don't require uniqueness, so they absolutely can't be the same. Not sure when did the data-testId's go through a formal W3C standards/peer review... Given we are automating web pages using and conforming to the HTML might be the best thing, it has and supports element ID's. For convenience here is a link to the HTML spec... https://www.w3.org/TR/2011/WD-html5-20110525/elements.html#the-id-attribute

Honestly was surprised when PW introduced the changes for the getByXXX(). Awesome, but where in the hell is getById(). Some may not use it, find. But many do and prefer it. Would argue those arguing against not using ID is at best a subjective position, so if one doesn't want to use them great don't those who would prefer based on their needs great us it... IIRC, locators were made 'strict' around 1.13 which would be right in line with how getById() should work... Honestly there is no rational reason why there is no getById()?

Furthermore, the recorder && codegen... Sure wish it makes up it's mind... At first when raising issue on it's functionality, was told "It is only suppose to be a tool to learn how to use locators..." I had opened an issue prior asking uses be able to influence how the recorder selects elements. Might assert that is now needed all the more? Really no reason not to make codegen a but customizable if possible. Would assert there should be an order of precedence users should be allowed If there are multiple i.e.

Would assert codegen honor what the user's preference is...?

Also a discussion from the PW Discord got me to post here... (https://discord.com/channels/807756831384403968/1055818559210655796)

This exactly, I don't understand how decisions that go against web standards are applied so liberally without any apparent discussion or thought. You can make arguments about codegen being a learning tool and we left it out, but having no getbyID function in the broader API makes absolutely no sense and is against known standards.

gselsidi commented 1 year ago

broke getbyid into a new ticket, whether codegen changes are implemented or not, a getbyID function should be part of the API.

https://github.com/microsoft/playwright/issues/19763

gselsidi commented 1 year ago

Also it seems when none of the getbymethods apply the codegen falls back to .locator, or it randomly decides to use .locator i don't know. The beauty is even when it decides to use .locator it completely ignores id's and goes for flaky locators like attribute name.

It's like it was decided to erase id's from existence and history lol

edmund-need2know commented 1 year ago

Hilariously, I have literally the opposite problem to OP, yet want the same solution. Our framework gives all input fields a unique id that's generated at build-time like #input-423 which PW is prioritising with its selectors, even when an explicitly data-testid is specified.

Every time our test-runner spins up to run tests, all those id's are regenerated and no longer work, so code-gen for all form-fields is effectively broken.

ZooDoo4U commented 1 year ago

@edmund-need2know Might assert this isn't the same issue, as your code is generated at build time doesn't matter the search strategy, or method used be it id or data-testId. Would assert most front end code is written by a person vs. machine generated. Would think unless the code generator creates and links up fields as needed you'll need to figure out some way to say "#input-423" is first name. Be it some attributes or other meta data the generator can consume to provide a mapping layer, seems you have a lot of fun getting this to work.