phetsims / aqua

Automatic QUality Assurance

Investigate tools for automation in testing #163

Open samreid opened 1 year ago

samreid commented 1 year ago

On Slack in #dev-public, @liammulh said:

I was thinking of automating some of the really tedious QA tasks, but if all of the buttons are drawn graphically, I think it would be hard to do that.

I said:

Would love to hear more about your ideas for automating tedious QA tasks.

@liammulh said:

When I was reviewing the test task for CCK on GitHub, one of the instructions is "click every button". This seemed really tedious to me, so I was thinking it might be nice if we could automate verifying that some of the buttons work. In my other project we use a tool called Cypress to run end-to-end tests for our web app. I was thinking of writing some simple tests in Cypress for checking stuff like the PhET menu opens up when you click it, the PhET home page link works, etc.
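A minimal Cypress sketch of the kind of smoke test described above might look like the following. Everything sim-specific here is an assumption: the local URL and the selectors would only work if the sim exposes the menu button and menu items as identifiable DOM elements (the buttons being "drawn graphically" is exactly the complication raised above).

```ts
// Hypothetical Cypress smoke test: open a sim, click the PhET menu button,
// and check that a known menu item appears.
describe('PhET menu', () => {
  it('opens when the menu button is clicked', () => {
    // Assumed local build URL; adjust to wherever the sim is served.
    cy.visit('http://localhost/circuit-construction-kit-dc_en.html');

    // Assumed selector; this only works if the sim exposes the button in the
    // DOM, e.g. through its accessibility markup.
    cy.get('button[aria-label="PhET Menu"]').click();

    // Assumed menu item text.
    cy.contains('PhET Website').should('be.visible');
  });
});
```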

I said:

That would be really nice. I’ve also heard of Selenium. And some teams use Puppeteer for this. But I’m unclear how you describe the expected behavior for those test harnesses (for sim-specific buttons).

@liammulh said:

Right, I think it would be hard to describe behavior for something complex like sim-specific buttons. If we could get a reference to them, do they emit events when clicked?

I said:

I would like the ability to run through QA testing fully on one platform, see how the sim behaves, and “record” that as a session. Then play those inputs back on all other platforms and see if they give the same behavior. That would help with the cross-platform part of testing. What if we used a program like https://www.macrorecorder.com/ and recorded a “QA session” on one platform, then played it back on other platforms to make sure the behavior is the same?
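In browser terms, that record/replay idea could be sketched as below. Everything here is an assumption: the id of the sim's display element, and that replayed synthetic events drive the sim the same way trusted user input does (canvas-based input handling may ignore untrusted events).

```ts
// Sketch: record pointer events with timestamps, then re-dispatch them.
type RecordedEvent = { type: string; x: number; y: number; t: number };

const recording: RecordedEvent[] = [];
const display = document.getElementById('sim-display')!; // hypothetical id
const start = performance.now();

// Record: capture pointer events with offsets relative to the session start.
for (const type of ['pointerdown', 'pointermove', 'pointerup']) {
  display.addEventListener(type, e => {
    const p = e as PointerEvent;
    recording.push({ type, x: p.clientX, y: p.clientY, t: performance.now() - start });
  });
}

// Replay: re-dispatch each event at its original offset. Note synthetic
// events have isTrusted === false, which some input systems ignore.
function replay(events: RecordedEvent[]): void {
  for (const ev of events) {
    setTimeout(() => {
      display.dispatchEvent(new PointerEvent(ev.type, {
        clientX: ev.x, clientY: ev.y, bubbles: true
      }));
    }, ev.t);
  }
}
```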

Then I wrote in the QA channel:

In #dev-public @Liam and I were talking about improving automation for QA testing. I saw there is a new class of tooling that may be able to help us here. Ideally we could record the expected behavior on one platform and automatically see whether all other platforms give equivalent behavior. Several tools use AI and image processing to (a) detect what to click and (b) see if the result is the same. Things like:

- https://katalon.com/visual-testing
- https://applitools.com/
- https://www.testim.io/test-automation-tool/
- https://smartbear.com/product/testcomplete/
- https://www.selenium.dev/selenium-ide/
- https://testproject.io/

What do you think? Should we look into it? Should we schedule a goal for next sprint to take a closer look?
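The image-comparison half of what these tools do can be approximated with open-source pieces. For example, here is a sketch using the pixelmatch and pngjs npm packages to diff screenshots of the same sim state captured on two platforms (the file names are hypothetical):

```ts
import * as fs from 'fs';
import { PNG } from 'pngjs';
import pixelmatch from 'pixelmatch';

// Hypothetical screenshots of the same sim state on two platforms.
const imgA = PNG.sync.read(fs.readFileSync('mac-chrome.png'));
const imgB = PNG.sync.read(fs.readFileSync('win-firefox.png'));
const { width, height } = imgA;
const diff = new PNG({ width, height });

// pixelmatch returns the number of mismatched pixels; the threshold option
// tunes sensitivity to small per-pixel color differences.
const mismatched = pixelmatch(imgA.data, imgB.data, diff.data, width, height, {
  threshold: 0.1
});

fs.writeFileSync('diff.png', PNG.sync.write(diff));
console.log(`${mismatched} pixels differ`);
```

In practice, cross-platform rendering differences (fonts, antialiasing) would produce some mismatched pixels even for equivalent behavior, which is presumably where the AI layer of these commercial tools comes in.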

@marlitas said:

I think this is a really interesting and great conversation. Reading through the dev-public thread, it does seem like there are some straightforward ideas for speeding up QA testing that could help relieve the backlog. I'm curious to hear what QA team members think. I wonder, though: are there quirks when shifting between different platforms that a person would catch, but AI might have a hard time with?

Nancy-Salpepi commented 1 year ago

I am all for trying something that may speed up QA testing and reduce the bottleneck. I like the idea of being able to test/record on one platform and then play it back on others to help with cross-platform testing. Aren't these tools costly, though?

liammulh commented 1 year ago

What if we used a program like https://www.macrorecorder.com/ and recorded a “QA session” on one platform, then played it back on other platforms to make sure the behavior is the same?

This is crucial. If we can't do this, the new class of AI visual testing tools won't be much help.

liammulh commented 1 year ago

I gave Selenium IDE and Datadog's Synthetic Monitoring a try, and it looks like they both assume the presence of unique, identifiable DOM elements.

zepumph commented 1 year ago

Just tagging https://github.com/phetsims/phet-io-wrappers/issues/361 as a paper trail on how we have wanted to do this for PhET-iO stuff for a while.