eWert-Online / OSnap

OSnap is a snapshot testing tool that mainly focuses on speed and ease of use.
https://ewert-online.github.io/OSnap/

Watch mode #10

lessp opened this issue 2 years ago

lessp commented 2 years ago

I'm currently using redemon to watch for file changes[1], but perhaps a watch mode is something that could be considered out of the box?

[1]: `npx redemon --path=src yarn osnap`

eWert-Online commented 2 years ago

I am open to other opinions on this, but I think snapshot tests aren't really a good fit for a watch mode.

In the projects where I am using OSnap, we are always working with more than 1000 snapshots, which take around 1 minute to test. It is still fast, but I wouldn't want to wait a minute after every save of a file. That said, I also thought about a watch mode that only retests the snapshots which failed the previous run. But then a successful "watch" run could still hide errors in untested snapshots.

What is your opinion on this? How did you imagine the watch mode working?

lessp commented 2 years ago

Yep, I think that makes sense!

Obviously when I tested this out I only ran one test, so the example was pretty contrived.

My use case was refactoring the styling approach for a component, where some sort of watch mode would make it convenient to know when the design matches the original one, instead of re-running the test manually and potentially missing a green run.

Perhaps semi-related, do you know where OSnap spends most of its time when running a test suite? Browser initialisation? With my naive redemon approach I imagine it has to set everything up from scratch, so even though I only had one test it averaged around 6-7 seconds per run. Curious if this could be reduced as well (e.g. by keeping things idle between runs).

EDIT: I realise I didn't really answer your question, so let me see if I understand you correctly.

> That said, I also thought about a watch mode that only retests the snapshots which failed the previous run. But then a successful "watch" run could still hide errors in untested snapshots.

So, because we're only re-running the failed snapshots and (hopefully) making them pass, we can't guarantee that we didn't break any of the previously passing tests, and so it'd falsely report a successful suite. That's a good point. 🤔

I guess instinctively one solution would be to simply run the full suite after the failed ones have succeeded, but the UX may be questionable, since you might be working on something for some amount of time only to realise at the end that you actually broke something else.

EDIT 2:

Another solution would perhaps be to approach it from another angle, where you'd only watch a certain test suite. At least then you'd get rid of the issue of users falsely believing that all tests are passing, because they've now explicitly pointed their attention at a specific suite. 🤷

eWert-Online commented 2 years ago

> So, because we're only re-running the failed snapshots and (hopefully) making them pass, we can't guarantee that we didn't break any of the previously passing tests, and so it'd falsely report a successful suite.

Yes, exactly 🙂

> My use case was refactoring the styling approach for a component, where some sort of watch mode would make it convenient to know when the design matches the original one, instead of re-running the test manually and potentially missing a green run.

So it would be more of a "watch until green" approach.

What do you think of the following idea:

  1. OSnap gets started with some kind of flag indicating that it should watch and rerun the failed tests.
  2. In the first run, it behaves like a normal test run.
  3. If it doesn't find diffs, it exits as normal. If some tests do fail, the failing tests are added to a "watchlist" and get rerun on every file save.
  4. If a test on the watchlist runs successfully, it is removed from the watchlist.
  5. If there are no tests on the watchlist anymore, the user is asked to confirm a new complete test run. Something like: "There are no more tests being watched. You should now run the complete test suite again to check for newly introduced diffs. Press any key to start a complete test run..."
  6. The whole test suite gets run again, which loops back to step 2.
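
A minimal sketch of that loop (`all_tests`, `run_tests` and `wait_for_file_change` are made-up stand-ins, not OSnap's actual API):

```ocaml
(* Rough sketch of the proposed "watch until green" loop.
   All names here are stand-ins, not OSnap's real internals. *)

let all_tests = [ "header"; "button" ]

(* Stand-in: run the given tests and return the ones with diffs. *)
let run_tests tests = List.filter (fun _ -> Random.bool ()) tests

(* Stand-in: block until a watched file is saved. *)
let wait_for_file_change () = ignore (read_line ())

let rec watch () =
  (* Step 2: behave like a normal test run. *)
  match run_tests all_tests with
  | [] ->
    (* Step 3, green case: exit as normal. *)
    print_endline "No diffs found, exiting."
  | failed ->
    (* Step 3, red case: the failing tests become the watchlist. *)
    let watchlist = ref failed in
    while !watchlist <> [] do
      wait_for_file_change ();
      (* Step 4: rerun only the watchlist and drop tests that now pass. *)
      watchlist := run_tests !watchlist
    done;
    (* Step 5: watchlist is empty, confirm a complete rerun. *)
    print_endline
      "There are no more tests being watched. Press any key to start a \
       complete test run...";
    ignore (read_line ());
    (* Step 6: run the whole suite again, looping back to step 2. *)
    watch ()

let () =
  Random.self_init ();
  watch ()
```

(And if you only wanted to watch a specific suite, as suggested above, you'd just filter the test list before entering this loop.)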

> you might be working on something for some amount of time only to realise at the end that you actually broke something else

Yes, but this is a problem which could only be solved with a watcher that runs all tests all the time, which isn't really an option in normal- to large-sized projects. And you currently have this problem anyway (even without any kind of watcher).

> Perhaps semi-related, do you know where OSnap spends most of its time when running a test suite?

The browser is definitely the bottleneck. Most of the time, however, is probably spent waiting for the website to finish loading. Creating the diff image is also really slow at the moment... but I am already working on fixing that.

But yes, there is an initial startup time which could be saved when run in some kind of watch mode:

  • collecting and parsing all the test files (ignoring big folders like node_modules or other vendor folders is a good idea, to speed up this step)
  • starting the browser process and waiting for it to start the devtools API
  • creating the different "workers" (browser windows)
  • sorting the tests, so creating new snapshots is prioritized
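
In a built-in watch mode, that setup could be paid once and reused across reruns. A minimal sketch of the idea (again with made-up names, not OSnap's real internals):

```ocaml
(* Sketch only: the names below are invented for illustration. *)

(* One-time setup: parse the test files, start the browser and its
   devtools connection, create the workers. *)
let setup () =
  print_endline "parsing test files, starting browser, creating workers...";
  "session"  (* stand-in for the initialized browser/worker state *)

(* Per-save work: only the test run itself repeats. *)
let run_tests session = Printf.printf "running tests with %s\n" session

let () =
  let session = setup () in   (* paid once when watch mode starts *)
  for _save = 1 to 3 do       (* each file save reuses the session *)
    run_tests session
  done
```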

lessp commented 2 years ago

> So it would be more of a "watch until green" approach.

Yep!

> What do you think of the following idea:
>
>   1. OSnap gets started with some kind of flag indicating that it should watch and rerun the failed tests.
>   2. In the first run, it behaves like a normal test run.
>   3. If it doesn't find diffs, it exits as normal. If some tests do fail, the failing tests are added to a "watchlist" and get rerun on every file save.
>   4. If a test on the watchlist runs successfully, it is removed from the watchlist.
>   5. If there are no tests on the watchlist anymore, the user is asked to confirm a new complete test run. Something like: "There are no more tests being watched. You should now run the complete test suite again to check for newly introduced diffs. Press any key to start a complete test run..."
>   6. The whole test suite gets run again, which loops back to step 2.

I think this sounds like a reasonable approach. 🙂

> you might be working on something for some amount of time only to realise at the end that you actually broke something else

> Yes, but this is a problem which could only be solved with a watcher that runs all tests all the time, which isn't really an option in normal- to large-sized projects. And you currently have this problem anyway (even without any kind of watcher).

👍

> Perhaps semi-related, do you know where OSnap spends most of its time when running a test suite?

> The browser is definitely the bottleneck. Most of the time, however, is probably spent waiting for the website to finish loading. Creating the diff image is also really slow at the moment... but I am already working on fixing that.

> But yes, there is an initial startup time which could be saved when run in some kind of watch mode:
>
>   • collecting and parsing all the test files (ignoring big folders like node_modules or other vendor folders is a good idea, to speed up this step)
>   • starting the browser process and waiting for it to start the devtools API
>   • creating the different "workers" (browser windows)
>   • sorting the tests, so creating new snapshots is prioritized

Gotcha, that's helpful, thanks!