phetsims / aqua

Automatic QUality Assurance
MIT License
2 stars 4 forks source link

Should we use Qunit instead of our current test-sims.html? #16

Open zepumph opened 7 years ago

zepumph commented 7 years ago

From discussing phet-io testing with @samreid in phetsims/phet-io#158. Qunit has been good for testing accross the project and can offer more specificity than just the colored squares for each sim. If all of our tests are built on Qunit, rather than most of them, it may be easier to automate reporting, like with continuous automated testing run on Bayes. Maybe we should investigate using Qunit tests instead of test-sims.html. Tagging @jonathanolson for an opinion, and marking for dev discussion.

Here is a skype chat that also started this conversation:

[11:44:09 AM] Sam Reid: What do you think about changing the automated fuzz tests to using QUnit instead of our own grid of colors? [11:45:30 AM] Chris Malley: Do you mean the colored boxes in test-server? Only one of those boxes pertains to fuzz testing. [11:46:07 AM] Sam Reid: Well, one is for fuzz-testing the requirejs mode and one is for fuzz-testing the built mode, right? [11:47:00 AM] Chris Malley: None of the colored boxes is specifically for fuzz testing. [11:47:33 AM] Chris Malley: left is running requirejs (with optional fuzz), center is build, right is running built (with optional fuzz). [11:47:56 AM] Sam Reid: What if instead of a colored grid of boxes, it was a QUnit report? [11:47:59 AM] Chris Malley: if you omit ?fuzzMouse, then you can just test loading. [11:49:17 AM | Edited 11:49:30 AM] Chris Malley: I guess that (qunit) would be OK. [11:50:16 AM] Chris Malley: what's the motivation? [11:51:12 AM] Sam Reid: Michael and I are thinking about how to test PhET-iO wrappers, and choosing to go down the path of [a] building something that looks like our color grid or [b] doing something with QUnit that gives finer-grained error reporting. [11:51:31 AM] Sam Reid: I think for the PhET-iO we need the finer-grained error reports. [11:51:42 AM] Sam Reid: If we used the color grid for that, it could be many many columns. [11:51:55 AM] Sam Reid: So we started thinking why not render the main tests through QUnit as well? [11:52:08 AM] Sam Reid: For instance, you could check the item that says “show only failed tests”. [11:52:14 AM] Sam Reid: or “re-run only failed tests” [11:52:17 AM] Chris Malley: test-sever is generally a bit confusing until you get familiar with the colors. and the error messages don't clearly identify which test phase failed. [11:52:27 AM] Sam Reid: I agree. [11:52:46 AM | Edited 11:52:54 AM] Sam Reid: QUnit could be clear: e.g. “Faraday’s Law failed to build with this error [...]"

samreid commented 7 years ago

One feature I would love to have (which QUnit provides) is to re-run failed tests. I think we should try a recent version of QUnit for our sim tests, and upgrade all old QUnit tests to use the new version.

jbphet commented 7 years ago

QUnit is widely used and pretty well supported, so if we can migrate our automated testing to it with a reasonable amount of effort, I certainly wouldn't have any objections.

pixelzoom commented 7 years ago

Another reason to replace test-server with qunit is that test-server has been know to show errors with no associated messages -- see for example https://github.com/phetsims/chipper/issues/532#issuecomment-269041900.

jonathanolson commented 7 years ago

QUnit doesn't handle the complex ordering that test-server does. I'd like to hear a more detailed plan of how things would work with QUnit first.

Another reason to replace test-server with qunit is that test-server has been know to show errors with no associated messages -- see for example phetsims/chipper#532 (comment).

How would QUnit help with this?

jonathanolson commented 7 years ago

QUnit doesn't handle the complex ordering that test-server does.

To elaborate, it would be trickier to limit to X number of concurrent builds, and handle the test queue so that we only test a built sim after it is built. We'd require new queuing logic on the (preferably) server side, and the individual QUnit tests would need to poll the server to see the status of builds (can't/don't want to keep a client-server connection open that long).

QUnit's done() listener (at least at the older version) does not only fire once (when all tests are complete), but fires multiple times, so it was more difficult to integrate into my continuous testing prototype (https://github.com/phetsims/aqua/issues/15).

The main benefit of QUnit (to me) is a simple report interface and a good library of functions that make written tests easy to read and understand.

Upgrading QUnit (probably won't take too long to migrate, but affects many tests) sounds good, and rewriting the test-sims/test-server to use QUnit should be possible. Getting rid of the custom test view logic sounds good, so it may be a good thing to do.

jonathanolson commented 7 years ago

Additional note: just for cleaning up globals usage, we should upgrade QUnit (since we'll just need one global instead of tons).

jessegreenberg commented 7 years ago

rewriting the test-sims/test-server to use QUnit should be possible.

I don't know how much work this would take, but the benefits listed in this issue make it seem worthwhile, I have no objections to switching to QUnit.

samreid commented 7 years ago

@ariel-phet recommended this be something @jonathanolson can work on once Bayes is ready to go.

pixelzoom commented 4 years ago

@samreid do you know what the status of this issue is? Is there repos where we should be using QUnit and are not? Can we create repo-specific issues for those?

samreid commented 4 years ago

In discussions with @zepumph regarding PhET-iO automated testing, I recall we were considering running two types of tests, fuzzes and qunit tests, and not trying to embed the fuzzes in qunit harnesses. @zepumph is that your recollection as well?

zepumph commented 4 years ago

Yes, over in https://github.com/phetsims/aqua/issues/76 we want to pull wrapper fuzzing out of QUnit, and put them into a "test-sim" like architecture. I'd say this issue is likely out of date and not the best direction anymore, especially noting the concerns that @jonathanolson mentioned above. Should we close this @samreid?

samreid commented 4 years ago

I think it warrants further discussion, I can see the advantage of leveraging QUnit for the test presentation & re-running failed tests (I know we implemented this separately in test-sims.html).

samreid commented 4 years ago

It seems we need to have a larger discussion about how our tests are organized and run, there are also some notes in https://github.com/phetsims/aqua/issues/76#issuecomment-569433621. But it seems like this issue can be unassigned for now, until it is higher priority or until we are working on aqua again.

samreid commented 4 years ago

See also https://github.com/phetsims/phet-io-wrappers/issues/217 with some good reasons fuzz tests should not run in a qunit harness.