wallabyjs / public

Repository for Wallaby.js questions and issues
http://wallabyjs.com

UI/UX is missing confusing / critical details #1742

Open brian-mann opened 6 years ago

brian-mann commented 6 years ago

Issue description or question

I'm trying out Wallaby and I see enormous potential for it. I already bought a license and would love our entire team to adopt it... but am finding that I'm missing critical details that make it difficult to use efficiently.

Keep in mind I've used Wallaby for all of one day - I read through all of the docs, searched through many issues, applied it to an existing project (to compare against mocha performance), and here are my knee-jerk reactions.

1. Failed assertions are not actually showing up in the IDE.

They show up at the top of the "Console" output and in the web app view, but oftentimes they don't show up at all inside of VSCode.

--Good--

screen shot 2018-07-02 at 8 30 57 pm

--Bad--

screen shot 2018-07-02 at 8 29 44 pm

Notice how the error is missing and it's not highlighted in the Output tab?

2. Full stack traces are missing

I really prefer always being able to see the full stack trace. It often helps debug 3rd party code that may be throwing the error. Even if it's in a 3rd party node_module, I may want to go inside of it to understand the deeper details.

Here's what I strongly prefer to always see:

screen shot 2018-07-02 at 8 05 55 pm

I understand wanting to hide core node stack traces (though to me even these are valuable), but hiding stack traces from installed 3rd party modules is too much.

As it stands, the only way I've found to see these is to get access to the error instance and then console.log(err.stack). The problem is that none of these stack traces end up being clickable in the preview window, so it oftentimes takes me much longer to actually get to the source of the error than I would with a vanilla mocha run in a regular terminal.
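For reference, the workaround looks roughly like this (the spec and the fetchUsers helper are made up for illustration):

```js
const { expect } = require('chai');

describe('users API', () => {
  it('returns the expected payload', async () => {
    try {
      const payload = await fetchUsers(); // hypothetical helper under test
      expect(payload).to.have.length(3);
    } catch (err) {
      // the only way I've found to see the hidden frames - but nothing printed here is clickable
      console.log(err.stack);
      throw err;
    }
  });
});
```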

3. Run only on save

This alone is effectively going to prevent me from using Wallaby altogether. It's so unusable that I feel like this cannot possibly be the way it behaves for other users. I know it's been asked for on here by other users - and there don't appear to be any plans to address it.

To quickly summarize, what it "feels" like is happening is that Wallaby is in an endless "lagged" feedback loop behind my code changes, because it's trying to run them 95% of the time before I'm ready for it to run them. Since it's running them prematurely, it ends up displaying details that are immediately out of date.

Let's imagine I...

  1. Go to change a test
  2. Type a single character (that creates a syntax error)
  3. Wallaby immediately starts to run
  4. I immediately fix the error on the very next keystroke
  5. Even though I've continued on, within about 2 seconds I get the feedback from the syntax error I created from step 2.
  6. I'm immediately confused because I've already fixed the problem keystrokes ago. This is the feedback loop lag.
  7. After a few more seconds, Wallaby reconciles all the various changes that happened through my keystrokes.
  8. By the time Wallaby runs the correct code, it's literally seconds in arrears, and the whole process has ended up taking longer than if it ran my change once when I saved the change to disk.

I've spent hours tweaking my config - going from full-blown workers, to running only a single one, to trying out restarting, to inserting delays, etc. None of it has made any difference.

The delays configuration comes the closest to fixing the problem, but all it really ends up doing is creating race conditions that may still trigger, and worse, it inserts an artificial delay that I don't really want in the first place.
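For context, here's the general shape of what I've been tweaking. I'm not certain about the exact shape of the delays setting, so treat this as a rough sketch; the globs are placeholders for our project:

```js
// wallaby.js - rough sketch of the config I've been experimenting with (globs are placeholders)
module.exports = function () {
  return {
    files: ['lib/**/*.js'],
    tests: ['test/**/*.spec.js'],
    env: { type: 'node' },

    // tried everything from the default worker pool down to a single worker
    workers: { initial: 1, regular: 1 },

    // the artificial delay (ms) before a run kicks off - the thing I don't want in the first place
    delays: { run: 1000 }
  };
};
```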

All of this would be fixed if it just ran when I hit save. To me, that's the mental moment when I've said "commit this code to disk" and it's when I expect things to immediately kick off. Things moving around whilst I'm typing makes it feel like Wallaby is sluggish and not responsive to my feedback.

I understand the micro-optimization you may get from receiving feedback on every single local change - that Wallaby can begin processing changes within milliseconds of detecting them. The problem is that Wallaby cannot hold up its end of the bargain. By rerunning multiple times on an unstable state, it ends up taking longer than a single run against a stabilized state.

The end result of this is that by the time Wallaby "catches up" to my latest code changes, I've stopped trusting it. By that point I've seen a string of errors that made no sense to me, so when the real error is displayed, I'm confused as to whether or not this is "the true error".

Sometimes I see an error and immediately disregard it because I think it's out of sync. Then I wait a while, still don't believe the error, and check the Wallaby logs to see if it's crashed. I might then move the lines of the test file around just so that I can see values change, in order to prove to myself that it's running the latest version.

The time spent trying to understand whether a test is displaying stale state or a real error, and the lack of confidence that it is "correct", is debilitating. All other gains from this tool are wiped out.

That brings me to points 4 + 5 which I wanted to separate away from this specific one...

4. Workers appear to run endlessly without exiting

I thought Wallaby only ran the tests that changed. I have a suite of over 1000 tests in a project that take about 2 minutes to run in total.

This is what I'm seeing...

  1. I put an .only on a single test and it runs only that one... great, this is what I expect.
  2. I take off the .only and now all tests rerun... okay, sure, that's also what I'd expect to happen in vanilla mocha mode.
  3. Now I put an .only back on the same test.

From looking at the Output tab, I can see that Wallaby is continuing to run all of my tests from step 2. This goes on for about 2 minutes until they are all done. I have no idea if this is the correct or intended behavior.
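For reference, the toggle I'm describing is just mocha's standard exclusivity flag (the spec is made up):

```js
describe('users', () => {
  // steps 1 and 3: with .only, Wallaby runs just this test
  it.only('creates a user', () => {
    // ...
  });

  // step 2: removing .only above makes the whole 1000+ test suite rerun
  it('deletes a user', () => {
    // ...
  });
});
```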

This brings me to point 5...

5. Confused whether Wallaby is working / running

Maybe this is lack of experience with the tool, but I find it incredibly hard to know when it is or isn't working. I rely on the feedback I get on a per-line basis to get a feel for the state of my tests.

The problem is that when it's running all of the tests, I don't see anything in VSCode at all. I've started to train my eye to look down to see the little running animation, but that's the only feedback I get.

When I go to work in a test, I'm constantly second-guessing whether the test has run, whether it's up to date, and whether I should be seeing something in the UI. That's because it's possible I broke something that caused code not to be evaluated, which means Wallaby's helper text will never show up in the IDE. I guess maybe this comes down to: "what I inherently expect the IDE to do, I don't always see". I'm oftentimes left wondering who's at fault here. Is it my test? Is it Wallaby? Is it VSCode?

I'm not sure if this is possible (having never written an IDE extension), but I would prefer to see the test's state inline, the same way I see errors, console.logs, and the test speed. I feel like the code coverage blocks are not nearly as important as just seeing that the test's state is or isn't what I expect it to be.

Let's look at this picture from earlier again... is there anything visually apparent that shows whether (or why) this test is failing?

screen shot 2018-07-02 at 8 29 44 pm copy

The problem is that the code coverage symbols overpower the arguably more important bits - which tests are actually failing!

The line number which defines the test is green, but the test itself is failing. I believe I'm going to change the default colors to help mentally separate out these concepts - but I believe that mixing up the green code coverage blocks with the mental expectation of a green test is confusing.

It's the same problem with uncovered lines. I believe Wallaby could do a much better job here of effectively communicating the code coverage while also taking the test status into account.

The webapp does a phenomenal job of showing test states and code coverage, and I am extremely happy with it. But using VSCode I'm more confused than ever.

6. Logs from hooks are not showing up

I noticed that console.logs from afterEach or after hooks don't get accounted for in the output of a test. I could be wrong about this, but it's hard to even tell, since I know that Wallaby hides specific things from me. It makes me second-guess what it's doing.
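To be concrete, this is the kind of hook logging I mean (a minimal made-up mocha spec):

```js
describe('db queries', () => {
  afterEach(() => {
    // logs from here don't seem to get attributed to the test's output
    console.log('teardown: connection released');
  });

  it('selects rows', () => {
    // logs from the test body itself do show up per test
    console.log('query: SELECT * FROM users');
  });
});
```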

7. Output is not colorized

This is another really bad user experience note. I totally understand why this works the way it does, and can't disagree that in most situations it's correct... but...

This is what the output looks like in one of our projects...

screen shot 2018-07-02 at 7 55 44 pm

This is how I expect it to look...

screen shot 2018-07-02 at 9 28 08 pm

Here's another context of how it looks during another test...

screen shot 2018-07-02 at 9 29 38 pm

These colors and outputs are really essential to our tests. We use them in this project to inspect all of the SQL queries and have colored them accordingly based on a number of factors.

The problem is that the Output tab in VSCode is not colorized the way we expect. The other problem is that while it's sometimes useful to see the stack trace of a console.log... that is the exception, not the rule!

Anyone building any kind of CLI tool (which we do) needs to see their output without it being modified by Wallaby. We would never be able to introduce this into anything that utilizes console.log as a feature. This is IMO a common use case in node.js applications.

screen shot 2018-07-02 at 7 54 55 pm

I simply want to be able to stop Wallaby from embedding the source line into my console logs. I want to see my console output exactly as it's supposed to look, without ANY modifications.

For instance, the output on the webapp is far more useful and is not hijacked.

screen shot 2018-07-02 at 7 58 35 pm

However, per my explanation above, this still isn't good enough. It's still modifying the spacing of the logs, and they're also not colorized. Having the color in the web app would be really nice - you could use a simple ANSI-to-HTML converter to do that. I tried running Wallaby against tests for CLI apps we've written and the output is completely unusable.
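For example, a package like ansi-to-html (just one option; any similar converter would do) turns the escape codes into markup the web app could render:

```js
// sketch using the ansi-to-html package (npm install ansi-to-html)
const Convert = require('ansi-to-html');
const convert = new Convert();

// a colored SQL log line, the way our tests emit it
const raw = '\u001b[36mSELECT\u001b[0m * FROM users WHERE id = \u001b[33m1\u001b[0m';

// produces <span style="color:..."> markup instead of raw escape codes
console.log(convert.toHtml(raw));
```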

The error message (while looking great) is still missing its stack trace. In the web app you could hide it by default, but let me click a link to display it, like Atom does when there's an uncaught exception from an extension.

screen shot 2018-04-27 at 12 36 18 pm

In Cypress - we also hide the error details from you by default, but clicking on the error allows you to interact with the stack traces directly, which is useful.

8. Can I just get the "Output" to show up in the Terminal?

I have no idea if this is possible, but when I googled around trying to figure this out, it appears as if VSCode just wants you to run things in the integrated terminal. If that were the case, I'd be able to get my colors, and source lines would also become clickable. As it stands, I can't click any output. Here's an example...

screen shot 2018-07-02 at 9 38 21 pm

Final Thoughts

As a final side-note I'd be more than happy to show you what my workflow looks like under these conditions. I totally accept that I could be doing something wrong, or that there are various conditions that are not ideal given some of the assumptions Wallaby makes about the environment it's running in. I'm very much in the corner of wanting to see this tool succeed!

Code editor or IDE name and version

Visual Studio Code v1.24.1

OS name and version

OSX

ArtemGovorov commented 6 years ago

Hi Brian,

First of all, thanks a lot for your detailed feedback! We really appreciate the level of detail you've provided when explaining the issues you've hit.

It may take me a couple of days to fully respond to your feedback and to create a few separate issues based on it. But I'll start today with addressing some (1-5) of the raised issues.

1. Failed assertions are not actually showing up in the IDE.

Whenever Wallaby has a stack trace that contains the failing test, we are able to show the assertion in the editor. However, supertest (which you seem to be using) is known for throwing errors whose stack doesn't contain the failed test line(s).

This screenshot of yours demonstrates the issue:

screen shot 2018-07-02 at 8 05 55 pm

As you may see, there's no pointer in the stack to any of the executed test lines.

2. Full stack traces are missing

TBH, you are the first person to request showing node/node modules stacks, but we can definitely see the value and will create a separate feature request for it.

3. Run only on save

There's a feature request for it, but I have some questions regarding your specific scenario:

  1. Type a single character (that creates a syntax error)
  2. Wallaby immediately starts to run
  3. I immediately fix the error on the very next keystroke
  4. Even though I've continued on, within about 2 seconds I get the feedback from the syntax error I created from step 2.

When Wallaby encounters a syntax error (meaning that it can't even run tests), it should respond almost instantly (by not running any tests). Wallaby doesn't even show syntax errors in the editor (we only show runtime errors); for syntax errors, the editor's built-in tools/parsers do a much better job of displaying them.

All of this would be fixed if it just ran when I hit save. To me, that's the mental moment when I've said "commit this code to disk" and it's when I expect things to immediately kick off. Things moving around whilst I'm typing makes it feel like Wallaby is sluggish and not responsive to my feedback.

While I don't argue about the usefulness of the requested feature, the tool's idea (which has been working great for many of our users) is to remove the need for that mental moment (or minimise the number of those moments) when you actually need to stop and wait.

What I would love to ask you is to try using the tool for a few days in its current mode and see if your perception changes. It may not, because of some other factors affecting it, but I would love to hear back from you either way about whether anything changes in the way you use the tool (perhaps based on some of the other answers to bits of your feedback that I have provided).

4. Workers appear to run endlessly without exiting

I thought Wallaby only ran the tests that changed. I have a suite of over 1000 tests in a project that take about 2 minutes to run in total.

Yes, normally Wallaby only runs tests that changed (or are affected). To find out what tests are affected, we use dependency analysis of runtime data that we collect when tests are running.

However, when you add .only in one of your test files, or use a special comment to run tests only from specific test file(s), Wallaby starts running only those tests. It means that you can potentially go and make some changes anywhere in your codebase, and those changes can break other tests, but those tests are not running yet.

So when you remove .only, Wallaby needs to run all of your tests, because it didn't have a chance to run them incrementally (and only those that are required), to make sure that your changes didn't break anything.

I definitely see some room for a heuristic-based improvement - not (re)running all tests if, while Wallaby was in the .only mode, only that one test file had changed (and no other test files or source files) before removing .only. It may, however, be common that one would also be changing source files while in the .only mode.

There's also a way to tell Wallaby not to enter the .only mode by using the automaticTestFileSelection: false setting. With the setting set to false, even if you have a .only test in one spec, Wallaby will not stop running tests in other specs if you change some code that is covered by them. It addresses the issue with the full test re-run (there'll be no full reruns). The downside is that you will always see the combined code coverage from other test files' tests in your source file, even if you are focusing on one of your tests (that covers the source code).
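For clarity, the setting goes straight into your wallaby config, for example (the file globs are just placeholders):

```js
// wallaby.js - minimal sketch showing the setting (file globs are placeholders)
module.exports = function () {
  return {
    files: ['src/**/*.js'],
    tests: ['test/**/*.spec.js'],
    env: { type: 'node' },

    // don't switch into the ".only mode": other spec files keep running incrementally,
    // so there are no full reruns when .only is removed
    automaticTestFileSelection: false
  };
};
```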

Hope it makes sense.

5. Confused whether Wallaby is working / running

There's an easy way to tell whether Wallaby is running or not: check the Wallaby status indicator (in the bottom right corner of the IDE). If you see test stats, Wallaby is not doing anything (because of an error, or because everything is good); if you see a progress bar, it's running your tests.

I'm not sure if this is possible (having never written an IDE extension) but I would prefer seeing the test's state inline the same way I see errors, console.logs and the test speed. I feel like the code coverage blocks are not nearly as important as just seeing that the test's state is or isn't what I expect it to be.

Normally, whenever there's an error (and a test is failing) there's a red expectation failure displayed right in the test. Also, code coverage blocks are changed to pink (meaning that the pink lines are on the execution path of a failing test). So even if there's no big red expectation failure displayed right in the test (like in this unfortunate case with supertest), whenever you see some pink lines in a test - the test is failing.

ArtemGovorov commented 6 years ago

Also, any chance you're trying Wallaby on some repos that are public, so that we could also try them with your current config?

ArtemGovorov commented 6 years ago

Regarding the supertest issue with the stack, there's a workaround to get the stack and also an existing PR in the supertest repo; perhaps some of these solutions can help.
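To illustrate the general idea (not necessarily the exact approach from the linked solutions; the app and route are placeholders): creating a new error inside the test file puts a frame from the test into the stack, which gives Wallaby something to anchor the failure to.

```js
const request = require('supertest');
const app = require('../app'); // placeholder express app under test

it('GET /users responds with 200', () => {
  return request(app)
    .get('/users')
    .expect(200)
    .catch((err) => {
      // the new Error is created in this test file, so its stack contains a test line
      const wrapped = new Error(err.message);
      throw wrapped;
    });
});
```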

brian-mann commented 6 years ago

I will do as you request and continue to let you know my thoughts.

I need to spend some time trying it on a myriad of our existing projects to flex it in different ways. So far I've tried it on a CLI tool, and a regular express API.

I understand the supertest issue as you've stated it - however, it seems likely possible to determine that a test failed even when the error's stack does not point to the test's lines, and still associate the error with the test. You may not be able to point it to the right line number, but you could still print the error on the line of the test title itself (since you know that one).

That would likely fix all cases of bad actors like supertest.

I'll respond to your other questions as well when I have a chance. I'll see what I can do about putting together public repos. We have plenty of them, but so far I've only used it on private stuff.

Regarding showing the stack trace - as long as I had the option to expand it (either in the Output tab or, worst case, the web GUI) I would be happy. It's one of those situations where the stack trace itself is oftentimes not that useful, but that is an important piece of debugging in itself - when it's not useful (because it's from node core), that tells you a lot of information as is. Being familiar with node core also helps me pin down the path I believe the error should be originating from. Seeing the stack trace helps me confirm that suspicion. Not being able to immediately see the stack trace means that I can't be 100% sure of its origin.

It's one of those things that by itself doesn't immediately make it clear why the error is happening, but it does eliminate a huge swath of potential avenues by helping you understand where it's not coming from.

ArtemGovorov commented 6 years ago

Thanks!

Created separate issues for

brian-mann commented 6 years ago

Solid.

Wanted to report back with an update. I've been knee-deep in Wallaby the last few days. After understanding some of your points and spending more time with it, I can positively report back that it looks like it's going to work for us. I'd like to roll it out to our whole team.

With that said I have a new laundry list of improvements, thoughts, and suggestions :-)

@ArtemGovorov I'm going to shoot you an email here in a second. Maybe we can find some time to do another screenshare and I can show you some of the stuff we're working on and how we're using Wallaby.

brian-mann commented 6 years ago

Also for anyone else - by switching to console.error(...) as opposed to console.log(...), I was able to bypass the stack trace hijacking, which enables me to see just the logs for the appropriate test.

It looks like this...

screen shot 2018-07-06 at 11 51 47 am

It's still not quite as ideal as our previous setup with Mocha, but the fact that Wallaby stores all the logs per test is extremely helpful.

This is our previous setup...

screen shot 2018-07-06 at 11 42 17 am

Ideally if we could get the spacing + the colors worked out, I'd be happy. We considered automatically redirecting console.log or console.error to a logfile and then tailing it in order to get our colors back - but then that would only work when running a single test in exclusive mode.
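The redirect-and-tail idea we considered would look roughly like this (the logfile path is arbitrary); you'd then run tail -f /tmp/wallaby-test.log in a real terminal to get the colors and spacing back:

```js
// rough sketch: point console.log/console.error at a file stream so a real terminal can tail it
const fs = require('fs');
const util = require('util');

const out = fs.createWriteStream('/tmp/wallaby-test.log', { flags: 'a' });

for (const method of ['log', 'error']) {
  console[method] = (...args) => {
    // util.format keeps printf-style formatting; ANSI color codes pass through untouched
    out.write(util.format(...args) + '\n');
  };
}
```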