Oh this is great timing. I'm literally working on Jest's plan for the first half of 2017 at Facebook right now and we are very curious about what improvements we can make to snapshot testing. Is this something that you are specifically interested in exploring and potentially working on?
Two of the things I'd like to see done are:
Yeah, putting the ball in play seemed worthwhile. What I can offer now, while people are exploring, is a viewpoint on what seems realistic for an ordinary working dev to set up, maintain, and review.
This came up while I am adapting a subset of Whinepad from Stoyan’s book for a potential meetup presentation about Refactoring confidently with checks and tests (Flow and Jest). Let’s keep it as an open possibility that I might learn enough to work with y’all on it in some way. My mascot is a turtle, if you need a mental picture :)
👋 I thought I'd chime in with responses to some of the comments above and other feedback from my experience with snapshot tests.
.snapshot file would be most important, but maybe there'd be use-cases for comparing across files?) Thanks!
@davidtheclark Thank you for feedback from your experience. That is what we want.
I've found it a little too easy to accidentally update more snapshots than I intended to
✋ me too
What do you think is the danger of having snapshots that are too large?
Important question: if I gave the impression that size itself is the danger, then I miscommunicated.
What is the motivation for excluding parts?
For some snapshots to include all details as a baseline makes sense to me: I can expect and welcome the frequent diff to review. As someone said, green-red-green is like TDD inside out.
But before snapshots, most tests made claims about specific details and intentionally overlooked changes to irrelevant details.
If too many snapshots require people to verify relevant changes and at the same time disregard irrelevant changes, then it seems like a half step forward and a step backward. Less traversal logic to write initially, but now people implement it repeatedly.
Please don’t mistake me: I am a big fan of snapshots! But see a following comment about cognitive overload. Too many false positive changes, also known as crying wolf, is another way to describe the problem that concerns me. Again, sorry for not making that clearer in the original comment.
By using Enzyme selectors… unnecessary complexity… It would save Jest a lot of complexity to outsource…
Yes, these are wise constraints on potential solutions. Less is more, for sure.
If you are already using Enzyme to select what you expect(…).toMatchSnapshot(), then good for you! Some recipes on the Jest web site and up-to-date blog articles might be all it takes.
if I gave the impression that size itself is the danger
I didn't mean "large" in terms of filesize, but large in terms of the quantity of data in the snapshot. So my question "What is the danger of having snapshots that are too large?" was supposed to be essentially the same as the question "What is the motivation for excluding parts?" Sorry if that was unclear.
But before snapshots, most tests made claims about specific details and intentionally overlooked changes to irrelevant details.
If you are making specific assertions about objects instead of checking the whole thing; if you only care about one piece, not the whole — why use snapshots? Might that be time to reset to plain old assertions?
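For example, one focused assertion about the single detail that matters could replace a full snapshot. A minimal sketch; MyComponent, its import path, and the expected title are illustrative only:

import React from 'react';
import {shallow} from 'enzyme';
import MyComponent from './MyComponent'; // path assumed

it('renders the right title', () => {
  const wrapper = shallow(<MyComponent className="strong-class" />);
  // A claim about one relevant detail instead of a snapshot of everything:
  expect(wrapper.find('h3').text()).toBe('Expected title');
});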
In addition to exploring qualitatively, another leg to walk forward is reasoning quantitatively. Some initial searching found two concepts:
Quality inspection task load, which refers to an inverted-U-shaped relationship between task demand and performance level, formally known as the Yerkes–Dodson law.
For the following quote from Quality Inspection Task in Modern Manufacturing, the analogies are from “noise” to irrelevant changes and from “quality of stored representation of standard” to the mental traversal and filtering needed to interpret a complete snapshot that has a limited purpose:
This performance decrement may result either from signal-data limits (weak signal in noise), or memory-data limits (quality of stored representation of the standard in delayed comparison memory tasks).
Often associated with learning, cognitive load is explained in a way that applies to this issue in https://www.nngroup.com/articles/minimize-cognitive-load/
Designers should, however, strive to eliminate, or at least minimize, extraneous cognitive load: processing that takes up mental resources, but doesn't actually help users understand the content
User attention is a precious resource, and should be allocated accordingly.
Here are the 3 tips from the article:
Depending on the timing, applying research from cognitive psychology and UX design to “diff decision fatigue” to determine a reasonable number of changes to evaluate and a tolerable proportion of false negatives/positives could give a target to achieve, or evaluate the reduction to load, or both.
Maybe a win-win for a bright intern to study snapshot testing at Facebook as an A/B test of product teams with and without attention to this issue, as a way to earn an academic credit and have an interesting lightning talk, while providing useful info and having time left over to lend a hand with code :)
I have been using snapshot testing for a few months and I'd like to share my humble opinion.
I tend to write stateless functional components and snapshot testing is perfect for it.
However, I got a bit overwhelmed by the amount of diffs, as you guys mentioned, so my solution was to use shallowToJson provided by https://github.com/adriantoine/enzyme-to-json.
I wish jest had an option for shallow rendering out of the box.
Just for reference, it isn't Jest that is missing shallow rendering; it is react-test-renderer that is currently missing it. I agree it makes sense to add this and I believe it shouldn't require a huge effort. Is this something you could potentially contribute to the react project? It'll benefit literally every single person that uses Jest + React.
@cpojer I'd love to contribute back to the project, I just wouldn't know where to start for something like this.
See https://github.com/facebook/react/issues/7148 – I'm sure the React team can help you get started.
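In the meantime, a shallow render can already be snapshotted by using the React shallow renderer directly, since Jest serializes React elements. A minimal sketch; MyComponent and its path are illustrative only:

import React from 'react';
import {createRenderer} from 'react-addons-test-utils';
import MyComponent from './MyComponent'; // path assumed

it('renders shallowly', () => {
  const shallowRenderer = createRenderer();
  shallowRenderer.render(<MyComponent className="strong-class" />);
  // getRenderOutput() returns a React element, rendered one level deep
  expect(shallowRenderer.getRenderOutput()).toMatchSnapshot();
});

The custom matcher shared later in this thread wraps essentially this pattern.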
@raspo Thank you for bringing shallowToJson into the discussion! Here is a question:
For people who use Enzyme for its traversal API, how much improvement would it be if the Enzyme package included enzyme-to-json, so that ShallowWrapper had a .toJSON method and you could write:
import { shallow } from 'enzyme';
it('renders the right title', () => {
const wrapper = shallow(<MyComponent className="strong-class"/>);
expect(wrapper.find('h3').toJSON()).toMatchSnapshot();
});
Instead of what you write now:
import { shallow } from 'enzyme';
import { shallowToJson } from 'enzyme-to-json';
it('renders the right title', () => {
const wrapper = shallow(<MyComponent className="strong-class"/>);
expect(shallowToJson(wrapper.find('h3'))).toMatchSnapshot();
});
@pedrottimark I personally don't make much use of the traversal API (I can cover most of what I care for with snapshots). With that said, I think that would be a great idea!
I don't know what the technical implications are though. There must be a reason why enzyme-to-json has its own project instead of being a "simple" PR for enzyme.
Update to https://github.com/facebook/jest/issues/2197#issuecomment-264987874 about shallow snapshots:
enzyme-to-json gave me the link to where the Enzyme team already decided not to support a .toJSON method: https://github.com/airbnb/enzyme/pull/541
Of the use cases described above, only diff seems the highest priority to explore.
Given prev and next state of data, or a component rendered deep or shallow, imagine:
expect(diffSnapshot(prev, next)).toMatchSnapshot();
Here is an experiment based on diff and pretty-format. Can you interpret the state change?
exports[`CheckboxWithLabel changes the text after click (only diff) 1`] = `
"
- checked={false}
+ checked={true}
- Off
+ On
"
`;
If you pretend that the red and green GitHub convention is green and red Jest convention, can you interpret the result as a possible mistake related to internationalization of the component?
"
- checked={false}
+ checked={true}
-- Off
-+ On
+- éteint
++ allumé
"
Or are the -- -+ +- ++ prefixes from combining the snapshot diff and the state change too confusing?
I will dig deeper into various data types and render methods. This exploration does not presume that a diffSnapshot function (naming suggestions welcome :) must become part of Jest API.
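For the curious, here is a rough sketch of what such a diffSnapshot helper could look like, assuming the diff (jsdiff) and pretty-format packages; the helper actually used for these experiments is not shown in this thread, so treat the details as illustrative:

import {diffLines} from 'diff'; // jsdiff
import prettyFormat from 'pretty-format';

// Plugin access can differ between pretty-format versions.
const {ReactTestComponent, ReactElement} = prettyFormat.plugins;

function diffSnapshot(prev, next) {
  const options = {plugins: [ReactTestComponent, ReactElement]};
  // Serialize both states, then keep only the lines that changed.
  return diffLines(prettyFormat(prev, options), prettyFormat(next, options))
    .filter(part => part.added || part.removed)
    .map(part =>
      part.value
        .replace(/\n$/, '')
        .split('\n')
        .map(line => (part.removed ? '- ' : '+ ') + line)
        .join('\n'),
    )
    .join('\n');
}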
Here is another experiment. Do < and > instead of - and + help or hurt your understanding?
exports[`CheckboxWithLabel changes the text after click (only diff) 1`] = `
< checked={false}
> checked={true}
< Off
> On
`;
If you pretend that the red and green GitHub convention is green and red Jest convention, can you interpret the result as a possible mistake related to internationalization of the component?
< checked={false}
> checked={true}
-< Off
-> On
+< éteint
+> allumé
P.S. Also experimenting with a snapshot serializer for changes from diff to get rid of:
Negative nit: the closing bracket is an irrelevant change that can show up once in a while alongside real differences. For example, the relevant changed prop1 is last in sort order and you add prop2:
-< prop1="value1-prev" />
-> prop1="value1-next" />
+< prop1="value1-prev"
+> prop1="value1-next"
Major misunderstanding to avoid: if there is a change in a sequence, for example, of todos:
expect(diffSnapshot(prev, next)).toMatchSnapshot();
exports[`todo marks a todo as completed and diffs the object 1`] = `
< "completed": false,
> "completed": true,
`;
What if the right change occurred in the wrong element?
Here is a more specific claim to get the same snapshot:
expect(diffSnapshot(prev[index], next[index])).toMatchSnapshot();
The same principle applies to adding a todo. It was added, but where?
What strikes me as the main problem with diff-only snapshots, regardless of the annotation style, is that the message lacks context. This is why I would prefer to use entire diffs or focused assertions.
For example, let's say you have a component that uses the class bg-green in three places. In another state, all those references become bg-pink; and in another they all become bg-red. You take diff snapshots of each state; then you make a mistake that breaks one of the bg-pink expectations. Without surrounding context the word bg-pink in the error message is not helpful in finding the problem.
Yes, context! So a diff snapshot displays hints about what changes are expected?
exports[`CheckboxWithLabel changes attribute and text after click (diff) 1`] = `
<label>
<input
< checked={false}
> checked={true}
onChange={[Function]}
type="checkbox" />
< Off
> On
</label>
`;
To explore that theory, here are 4 examples of test results for relevant versus irrelevant changes:
If you change how CheckboxWithLabel binds its onChange function, you expect no changes.
If you change how CheckboxWithLabel binds its onChange function, but break it, you see changes that are unexpected but relevant to the purpose of this test, so you need to fix the code:
<label>
<input
-< checked={false}
-> checked={true}
+ checked={false}
onChange={[Function]}
type="checkbox" />
-< Off
-> On
+ Off
</label>
If you change the markup that CheckboxWithLabel renders, does the hint help you decide that these changes are expected and irrelevant to the purpose of this test, so you can update the snapshot:
- <label>
+ <div>
<input
< checked={false}
> checked={true}
+ id="id"
onChange={[Function]}
type="checkbox" />
-< Off
-> On
- </label>
+ <label
+ for="id">
+< Off
+> On
+ </label>
+ </div>
If you change the markup that CheckboxWithLabel renders, but break it, does the hint help you see the unexpected but relevant needle in the expected and irrelevant haystack? You need to fix the code, and then update the snapshot.
- <label>
+ <div>
<input
-< checked={false}
-> checked={true}
+ checked={false}
onChange={[Function]}
type="checkbox" />
-< Off
-> On
- </label>
+ <label
+ for="id">
+ Off
+ </label>
+ </div>
Or is the last example such a mess it’s evidence that focused assertions are more effective?
@davidtheclark Thank you for hanging in there for this conversation :)
Building step-by-step while responding to comments isn’t stubbornness, but applies advice from Gerald Weinberg to consider at least 3 possibilities before deciding on a change. If this issue reaches a point of diminishing returns and becomes one of the im-possibilities, we will have more confidence in whatever we do with snapshots instead. Maybe explaining guidelines and describing patterns more clearly for how to use what Jest already has is more helpful than adding features.
To tie up loose ends for the first two use cases:
For a better solution to the only text use case, as implied by comments, Enzyme has a .text method:
import { mount } from 'enzyme';
// TodoList and the todo fixtures (todo0, todo1, todo2) are from the Redux todos example.

describe('TodoList', () => {
it('renders text', () => {
const state = [todo0, todo1, todo2];
const onTodoClick = jest.fn();
const wrapper = mount(
<TodoList
onTodoClick={onTodoClick}
todos={state}
/>
);
expect(wrapper.text()).toEqual(state.map(({text}) => text).join(''));
});
});
For a better solution to without functions, see https://github.com/facebook/jest/issues/2202#issuecomment-267073209
To tie up the second loose end with a double knot, I just learned to use the Enzyme render method for a snapshot without functions. Compare the first test to the second test.
import renderer from 'react-test-renderer';
import {mount, shallow} from 'enzyme';
import enzymeToJSON from 'enzyme-to-json';
test('render with functions', () => {
const onClick = jest.fn();
const { text } = todo0;
const completed = false;
expect(renderer.create(
<Todo onClick={onClick} text={text} completed={completed} />
).toJSON()).toMatchSnapshot();
expect(enzymeToJSON(mount(
<Todo onClick={onClick} text={text} completed={completed} />
))).toMatchSnapshot();
expect(enzymeToJSON(shallow(
<Todo onClick={onClick} text={text} completed={completed} />
))).toMatchSnapshot();
});
// In React element, `style` prop has object value:
exports[`render with functions 1, 2, 3`] = `
<li
onClick={[Function]}
style={
Object {
"textDecoration": "none",
}
}>
Install Jest
</li>
`;
import {mount, shallow} from 'enzyme';
import enzymeToJSON from 'enzyme-to-json';
test('render without functions', () => {
const onClick = jest.fn();
const { text } = todo0;
const completed = false;
expect(enzymeToJSON(mount(
<Todo onClick={onClick} text={text} completed={completed} />
).render())).toMatchSnapshot();
expect(enzymeToJSON(shallow(
<Todo onClick={onClick} text={text} completed={completed} />
).render())).toMatchSnapshot();
});
// In HTML element, `style` attribute has string value:
exports[`render without functions 1, 2`] = `
<li
style="text-decoration: none;">
Install Jest
</li>
`;
For more information, see http://airbnb.io/enzyme/docs/api/render.html
You can omit enzymeToJSON in expect if your Jest 18 configuration has:
"snapshotSerializers": ["enzyme-to-json/serializer"]
Or if your Jest 17 configuration has:
"snapshotSerializers": ["<rootDir>/node_modules/enzyme-to-json/serializer"]
For more information, see https://github.com/adriantoine/enzyme-to-json#serializer
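With the serializer configured, the explicit enzymeToJSON call can be dropped. A minimal sketch, reusing the Todo component from the Redux todos example; the import path and fixture values are assumptions:

import React from 'react';
import {shallow} from 'enzyme';
import Todo from '../components/Todo'; // path assumed

test('render without explicit enzymeToJSON', () => {
  const todo0 = {text: 'Install Jest', completed: false}; // fixture assumed
  const wrapper = shallow(
    <Todo onClick={jest.fn()} text={todo0.text} completed={todo0.completed} />,
  );
  // The configured serializer converts the Enzyme wrapper before it is written to the snapshot.
  expect(wrapper).toMatchSnapshot();
});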
Example of diff snapshots in JSX versus HTML markup to test an update to a table cell. Some details omitted, of course. diffSnapshot is a helper that I wrote.
EDIT: My bad, the mountedTestObject helper was an unneeded distraction in the first draft.
describe('Table', () => {
it('updates a number field: Update of CRUD (JSX/HTML diff)', () => {
const recordIndex = 1; // any non-zero index
const fieldIndex = 0; // `when` field is a number
const field = fieldsReceived[fieldIndex];
invariant(field.type === 'number', 'testing a number field');
const store = createStore(reducer);
store.dispatch(receiveData(fieldsReceived, recordsReceived));
const wrapper = mount(
<Provider store={store}>
<Table />
</Provider>
);
const wrapperCell = wrapper.find('tbody tr').at(recordIndex).find('td').at(fieldIndex);
const initialJSX = enzymeToJSON(wrapperCell);
const initialHTML = enzymeToJSON(wrapperCell.render());
wrapperCell.simulate('doubleClick');
const updatingJSX = enzymeToJSON(wrapperCell);
const updatingHTML = enzymeToJSON(wrapperCell.render());
// increment the value in the cell
const updatedJSX = enzymeToJSON(wrapperCell);
const updatedHTML = enzymeToJSON(wrapperCell.render());
expect(diffSnapshot(initialJSX, updatingJSX)).toMatchSnapshot(); // likewise for the HTML variables
expect(diffSnapshot(updatingJSX, updatedJSX)).toMatchSnapshot(); // likewise for the HTML variables
});
});
exports[`Table updates a number field: Update of CRUD (JSX diff) 1`] = `
<td
data-field-key="when"
data-record-id={1}>
2016
> <form
> onSubmit={[Function]}>
> <input
> autoFocus={true}
> defaultValue={2016}
> type="number" />
> </form>
</td>
`;
exports[`Table updates a number field: Update of CRUD (JSX diff) 2`] = `
<td
data-field-key="when"
data-record-id={1}>
< 2016
< <form
< onSubmit={[Function]}>
< <input
< autoFocus={true}
< defaultValue={2016}
< type="number" />
< </form>
> 2017
</td>
`;
exports[`Table updates a number field: Update of CRUD (HTML diff) 1`] = `
<td
data-field-key="when"
data-record-id="1">
2016
> <form>
> <input
> type="number"
> value="2016" />
> </form>
</td>
`;
exports[`Table updates a number field: Update of CRUD (HTML diff) 2`] = `
<td
data-field-key="when"
data-record-id="1">
< 2016
< <form>
< <input
< type="number"
< value="2016" />
< </form>
> 2017
</td>
`;
Is it just me, or wouldn't a diffing app solve a lot of the problems here? The biggest problem seems to be that you're scrolling your terminal. That is, unless you want to commit the updated snapshots, and then compare using standard git diffing tools. The downside of that being that most of the time you're going to want to roll back your repository to the point before you abused it to compare snapshots. Maybe I didn't get the memo on how you're supposed to do this before you commit code??
If there currently is no solution, the path forward seems to be to use jest --testResultsProcessor myFormatter.js to capture the results. From there, you need an app (built with electron?) that knows how to diff that formatted result and your current __snapshots__ folder.
Basically a simple jest-specific snapshot diffing application seems like a must for us all. However, if we go down this route, something quite amazing seems to be waiting for us:
If we can reach this point--which ultimately isn't too hard--we can build tools into this snapshot comparison application to filter by component just like the React devtools for chrome. And here's the best part: if you're following best practices and managing to build your app almost entirely out of stateless components in combination with Redux, then all you need to do is to snapshot your ENTIRE app in all the states your app will ever be in. The result is you use the diffing application to filter and drill down to components by name, perhaps filtering by only "red" changed components, etc. The need to stub out child components and constantly concern yourself with how deeply/shallow you render goes away. You're no longer concerned about tests passing or failing--you're able to concern yourself with what you see as "red" and "green" in the diffing editor, even if that means you're always essentially "failing."
Another key ingredient is that you can see the redux state and actions used to create that component tree right next to your components. Ideally, the system is able to determine and display the relevant state to the component you're currently focused on, as well as the reducer code itself, colored in green and red. And to the right of those reducers, is the action creator code in green and red. The net result is you have a breadcrumb trail leading you back from the components to your reducers to your action creators. For example, you may have found a "red" component in the far left column. Then next to it is its relevant reducer (also "red"), and to the right of that is the relevant action creator's code that triggered that specific state change. However it is green, which means the culprit is the reducer. If however, the reducer was also green, you would know it was changes to the component that triggered the change.
Next, you can generate all this with completely automated tests. I.e. you don't need to write any code. Here's what you do: you create all the states you need by dispatching all your action creators in all their relevant combinations, and of course you do that in the redux devtools and simply export the list of actions. That sets in motion the chain of effects that will result in your entire component tree being rendered in all the states you're concerned with. Reducer snapshots are automatically generated along the way, and actions recorded.
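A rough sketch of that kind of replay with today's API, under heavy assumptions (configureStore, the App root component, and an exported actions.json are illustrative, not an existing tool):

import React from 'react';
import renderer from 'react-test-renderer';
import {Provider} from 'react-redux';
import configureStore from '../src/configureStore'; // assumed store factory
import App from '../src/App'; // assumed root component
import actions from './fixtures/actions.json'; // assumed export from the redux devtools

test('snapshots the whole app after every recorded action', () => {
  const store = configureStore();
  const component = renderer.create(
    <Provider store={store}>
      <App />
    </Provider>,
  );
  expect(component.toJSON()).toMatchSnapshot(); // initial state
  actions.forEach(action => {
    store.dispatch(action);
    expect(component.toJSON()).toMatchSnapshot(); // one snapshot per action
  });
});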
HERE'S WHERE IT GETS JUICY: From there I see you being able to expand your state tree, and examine your array of actions (i.e. the actions returned from getActions() if using redux-mock-store) right next to your component tree. And what I really mean is: since every snapshot is a snapshot of your ENTIRE application, you can filter and select a component, and then time-travel by selecting a dispatched action, and look at how that component looks at that point in time while also looking at the state in your redux store! Perhaps after one action, the component is green, but after another action, the component is red because a different code path (added by new code) was followed.
NEXT: we add another notion of time: code commits. So you can drag a slider to navigate through dispatched actions just as you currently can with the redux devtools, but you can also do so based on commits and see what your component code looks like at a different commit, side by side with the snapshot of the runtime result. From that commit, of course, you can "time travel" to actions in its "universe," i.e. all the actions snapshotted in that commit. You're essentially moving in 3D--or whatever you wanna call it--across the history of your snapshots, with lenses for state, actions, component code, reducer code, action creator code, and the ability to filter to just the relevant slices of each as you navigate between associated lenses.
Now, take the example above where you simply made changes to one of your components without involving reducers and action creators. Your goal is to determine where the change came from. Keep in mind: you are no longer wasting time diligently unit testing each component shallowly in isolation, so you don't have the benefit of having a specific test point you in its direction. We can compensate for that with the commit time traveling. First you pinpoint a potentially large "red" area in your component tree, and then you switch to git time-traveling mode. The snapshot of the component tree will change as you go back in commits, and the associated component code will as well. Basically the big block of red in your snapshot component tree will be reduced to just 1 or 2 lines of red in code changes. That's what you'll be looking out for, and you'll be able to scan back a few commits and see exactly where the red was first introduced.
In general, your workflow goes from what you're doing now, which is looking at pass/fails of tests you spent a lot of time optimizing to give you a relevant but not too noisy perspective of your code, to quickly analyzing your snapshots in this "3D" time-traveling differ app. You're now a detective armed with all the tools you need to get to the core of the problem--and you do so continually as part of your workflow--rather than perhaps over-confidently relying on whether your tests pass.
Using snapshots, your tests are going to fail every time. We need to overcome that with a solution built with the new rules of the game in mind. We can do that through a completely automated testing experience that shifts the "work" to an exploratory after the fact experience within this "multi-lens time traveling diffing browser."
Lastly, it's worth mentioning that it's redux and a component tree of almost completely pure functions that makes this all possible. And that's the way you should be developing. You don't even need to 100% accomplish it. I still have a few escape hatches and side-effects for various browser issues--I could still very much rely on this system, and independently write tests for side-effects as needed.
CONCLUSION: So what does this mean for the conversation here? Yea, we need a simpler way to do shallow rendering without using Enzyme. Enzyme should be removed from the picture. We need a simple selector api made specifically for the output of snapshots, etc. Developers should master attaining the optimum "signal to noise" for the depth of component tree rendering and the relationship of components to redux state. I've found the quickest way to test the "happy path" is to always test with redux state, rather than test unconnected components. For components that are leaf components that were never connected but rather received props from connected parents (which is the majority of components), I use the following setup I'm quite happy with:
import React from 'react'
import renderer from 'react-test-renderer'
import {Provider} from 'react-redux'
import createHistory from 'history/createBrowserHistory'
import configureStore from '../src/configureStore'
import {loadSomethingAsync} from '../src/actions/async';
// Default the argument to {} so createStore() can be called with no arguments.
export async function createStore({shouldLoad = true} = {}) {
const history = createHistory()
const store = configureStore(history)
if(shouldLoad) {
await store.dispatch(loadSomethingAsync())
}
return store
}
export default function connect(store, Component) {
return (mapStateToProps, shouldSnapshot=true) => {
let wrapper;
let tree;
store.subscribe(() => {
wrapper.forceUpdate();
});
let component = renderer.create(
<Provider store={store}>
<Wrapper
ref={wr => wrapper = wr}
store={store}
Component={Component}
mapStateToProps={mapStateToProps}
/>
</Provider>
)
if(shouldSnapshot) {
tree = snapshot(component);
}
return {
component,
tree,
}
}
}
class Wrapper extends React.Component {
render() {
let {store, Component, mapStateToProps} = this.props
let props = typeof mapStateToProps === 'function'
? mapStateToProps(store.getState(), store.dispatch.bind(store))
: mapStateToProps //object or undefined
return <Component {...props} />
}
}
export default function snapshot(component) {
let tree = component.toJSON()
expect(tree).toMatchSnapshot()
return tree
}
import createStore from './createStore'
import connect from './connect'
import snapshot from './snapshot'
export default async function snap(Component, props, ...args) {
if(isInstance(Component)) {
return snapshot(Component);
}
const store = await createStore(...args)
const render = connect(store, Component)
let {component, tree} = render(props)
return {
store,
component,
tree
}
}
const isInstance = (Component) => typeof Component === 'object' && Component._component
USAGE:
test('do something', async () => {
const store = await createStore()
const mapStateToProps = (state) => {
let {foo} = state
return {foo};
}
const render = connect(store, MyComponent) // reusable with
const {component, tree} = render(mapStateToProps) // different props; render returns {component, tree}
expect(tree).toMatchSnapshot()
store.dispatch({type: 'SOMETHING_HAPPENED'})
// or through a handler that does the same, eg: tree.props.onClick()
snap(component) // snapshot reflects state change :)
// now let's try re-using the enclosed component within
// our render function but with different mapStateToProps:
let {tree: tree2} = render({foo: 'bar'}); // you can also pass a plain object
expect(tree2).toMatchSnapshot()
});
SUPER QUICK USAGE using snap():
describe('<MyComponent />', () => {
let T;
const mapToProps = (state) => ({foo: state.foo, own: 'bar'})
it('render component, toJSON, snapshot it', async () => {
T = await snap(MyComponent, mapToProps)
})
it('click element in tree returned by snap above', () => {
T.tree.children[2].props.onClick()
snap(T.component) //snap is smart enough to know whether it's dealing with an instance or factory
})
it('manually dispatch to store returned by first snap above', () => {
T.store.dispatch(someAction())
snap(T.component) // the component has reactively stayed "alive" throughout
})
//THE IDEA: you're just constantly snapping like a photographer. All you gotta think
//about is calling that `snap` function and passing in the obvious arguments.
//It's all redux-connected and reactive internally, which otherwise jest doesn't
//give you out of the box like it does with calling plain old setState. In addition,
//it's snapshotted and expected, etc etc. At a lower level of abstraction it's
//configurable/composable, i.e. with different mapStateToProps for the same
//component.
//All that is left to do is choose how many children beneath you
//start mocking in order to get the "zoom" level that provides the best
//signal to noise ratio.
So the conclusion is we're doing too much work, and we do the above for now, but instead of thinking about how to best optimize our "zoom" level of snapshots, we need to take a step back and build a snapshot browser built from the ground up with these problems in mind. Jest is so good. It's changed the game. These options wouldn't be available to us if it wasn't for Jest, React and Redux. I think we have no choice but to double down and see the whole new world that is available to us and build for that, rather than build for the micro problems of yesterday that will all cease to exist in one fell swoop with something like this. Where do we start? How do we get the electron diffing app up asap? We open source that and get it going, and then we all contribute the various pieces as part of a plugin system. Let's go!
A DISTILLATION OF THE CORE CONCEPTS:
jest, comparing to last committed snapshots
Basically, to picture this in your head, first imagine the react devtools. Then imagine that instead of (or in addition to) props, in the right column you see the relevant redux state for the focused component. Then imagine redux devtools time traveling built right into that interface, and imagine the snapshots, mapped code, etc, changing as you time travel. Then imagine you can time travel your commits in a similar way. All the while, associated lenses are changing to reflect the combination of the 2 forms of time traveling + component filtering. And of course everything is color-coded in green and red to indicate what changes your attention should be drawn to. With all of this you should be armed with all the information you need to find culprit code and fix regressions.
I finally had time to read through this entire issue. Thanks for discussing so feverishly.
@faceyspacey: what you are bringing up is very interesting. We specifically didn't do any of this visual integration because we thought that it would be too slow and too annoying for engineers. However, building such a thing on top of Jest would be awesome and I'm sure many people would find it valuable.
@pedrottimark: I'm trying to figure out how we can take everything in this issue and turn it into an action plan to improve snapshot testing and make it more useful. I apologize for not fully digesting this issue when I closed your PR recently. So far the thing that sticks out to me the most is that snapshots can often get too big and that the value of the individual snapshots goes down.
One thing that @rogeliog came up with, which I guess is more about the narrative than how it actually works, is "snapshots over time". (See https://github.com/facebook/jest/blob/master/packages/jest-cli/src/__tests__/__snapshots__/watch-pattern-mode-test.js.snap ). The idea is that we want to snapshot how state evolves over time. @faceyspacey has discovered this, as per his comment above, as has @xixixao, who proposed diffs for action changes over time at FB. It seems like everyone is independently coming to the conclusion that we need some form of diffs, not just within one snapshot but also between two or more snapshots. The next question I'd like to find an answer to is whether this has to be a technical/behavior change in how snapshots work and whether we need new APIs, or whether this is mostly a visual change in how Jest is being used – basically, whether this is a change in how tests are run or how tests are written, or both. Once we have an answer to that, we can figure out how to solve this problem.
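As one concrete reading of "snapshots over time" that needs no new API, the state of a store can be snapshotted after each action. A minimal sketch, assuming the reducer and action shapes of the Redux todos example:

import {createStore} from 'redux';
import reducer from '../reducers'; // path assumed

test('todos state over time', () => {
  const store = createStore(reducer);
  expect(store.getState()).toMatchSnapshot(); // initial state
  store.dispatch({type: 'ADD_TODO', id: 0, text: 'Install Jest'});
  expect(store.getState()).toMatchSnapshot(); // after adding a todo
  store.dispatch({type: 'TOGGLE_TODO', id: 0});
  expect(store.getState()).toMatchSnapshot(); // after toggling it
});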
This is a very interesting discussion and I thought I'd share a solution I recently came up with for my team. I've added a custom matcher to our jest environment that will snapshot test the shallow-rendered output from a react tree or a component instance.
// setupTestFrameworkScriptFile.js
import { Component } from 'react';
import { createRenderer } from 'react-addons-test-utils';
jasmine.addMatchers({
toMatchShallowRenderSnapshot: () => ({
compare(expected) {
let tree;
if (expected instanceof Component) {
tree = expected.render();
} else {
const renderer = createRenderer();
renderer.render(expected);
tree = renderer.getRenderOutput();
}
expect(tree).toMatchSnapshot();
return { pass: true };
},
}),
});
An example of what our tests look like:
// MyComponent.test.js
import React from 'react';
import { mount } from 'enzyme';
import MyComponent from './MyComponent';
it('should render with class name', () => {
expect(<MyComponent className="test" />).toMatchShallowRenderSnapshot();
});
it('should toggle on', () => {
const wrap = mount(<MyComponent />);
wrap.find('button').first().simulate('click');
expect(wrap.instance()).toMatchShallowRenderSnapshot();
});
it('should toggle off', () => {
const wrap = mount(<MyComponent />);
wrap.find('button').first().simulate('click');
wrap.find('button').first().simulate('click');
expect(wrap.instance()).toMatchShallowRenderSnapshot();
});
@cpojer I just wanted to point out and let the Jest community know that WallabyJS in the meantime has become the perfect snapshot browser, especially since its creator @ArtemGovorov added a feature I requested to show snapshot diffs inline as you browse your tests:
https://github.com/wallabyjs/public/issues/936#issuecomment-271826848
https://twitter.com/wallabyjs/status/819121381002489856
Obviously the plan I laid out above is the basis for a company and a massive undertaking, but with Wallaby today you can achieve the foundation of the strategy. Wallaby is the perfect match for Jest. It's a match made in heaven. The combination has taken my workflow to a whole new level.
I think the next step for Wallaby is to add an "UPDATE SNAPSHOT" button on all failing snapshot diffs/tests so you can easily update one snapshot at a time (which is a current pain point just with Jest since granular test-matching is time-consuming and since at the very least you almost always end up updating all snapshots in matched files) AND the ability to view snapshots even if they aren't failing. You can already view your snapshots separately by adding them to the list of files Wallaby tracks, but you want to be able to view snapshots on one click of the corresponding test (again, even if it's not failing). If it's failing, you will see the diff on one click, which is the feature added above.
TL;DR: if you're using Jest and doing a lot of snapshotting, there's currently no better way to analyze/review/compare them than with Wallaby.
Closing this discussion with thanks to all commenters, so we can keep open issues < 200 :)
Jest has a challenge and an opportunity with snapshot testing:
https://twitter.com/cpojer/status/774427994077048832
To decide which ways to move forward… (for example, more traversal API: https://github.com/facebook/react/issues/7148#issuecomment-230937087) …let’s begin with end goals for effective snapshot testing.
Improve developer experience while testing and reviewing code changes:
Hypothesis to discuss: accurate decision = clear purpose + definite expectation + relevant details
Here are 3 use cases for y’all to challenge or confirm.
Examples of snapshots are for the following components:
Todo http://redux.js.org/docs/basics/ExampleTodoList.html#componentstodojs
TodoList http://redux.js.org/docs/basics/ExampleTodoList.html#componentstodolistjs
Link http://facebook.github.io/jest/docs/tutorial-react.html#snapshot-testing
Content: only text
A basic component specification is the text that people see or hear for a realistic variety of props and state. Developers and reviewers can usually anticipate whether or not a change to code is intended to cause a change to text.
Could we balance snapshots that make claims about everything…
…with some snapshots that make claims about only text content?
What do you think about an array of text nodes? Does the description tell you which details are included?
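A minimal sketch of one way to snapshot only text, walking the react-test-renderer JSON tree to collect an array of text nodes (textNodes is a hypothetical helper; the TodoList import path and the todo fixture are assumptions):

import React from 'react';
import renderer from 'react-test-renderer';
import TodoList from '../components/TodoList'; // path assumed

// Collect the text nodes of a rendered tree in document order.
function textNodes(node) {
  if (typeof node === 'string') {
    return [node];
  }
  if (node && node.children) {
    return node.children.reduce(
      (texts, child) => texts.concat(textNodes(child)),
      [],
    );
  }
  return [];
}

it('renders only text content', () => {
  const todos = [{id: 0, text: 'Install Jest', completed: false}];
  const tree = renderer.create(
    <TodoList todos={todos} onTodoClick={jest.fn()} />,
  ).toJSON();
  expect(textNodes(tree)).toMatchSnapshot();
});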
Structure: without functions
Depending on how a component specifies style, changes to markup can affect how people see content.
If the purpose of a snapshot is to make a claim about structure, then it has irrelevant changes when event handlers onMouseEnter and onMouseLeave are added to improve interaction.
What do you think about a snapshot without functions? Does the description tell you which details are excluded?
Bonus thought: accessibility baseline = content + certain details about structure?
Interaction: only diff
A way to test that a component “works the way it should” is to take snapshots of states during an interaction. But the relevant differences are needles in a haystack of irrelevant details, especially if the presence of an irrelevant change hides the absence of a relevant change. For example, a developer might incorrectly update a snapshot after refactoring when a class name didn’t change, but should have.
What do you think about snapshots of only the differences between rendered states? Does the description tell you which details are included?
This first draft format needs improvement: when there is a change, a diff of a diff doesn’t seem clear.
What next?
Of course, if this issue isn’t helpful, you can close it with no hard feelings :)
Otherwise, let’s imagine, discuss, and then describe an improved developer experience.
Moar use cases are welcome!