apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
61.76k stars 13.5k forks source link

[Proposal] Superset Frontend Testing Guidelines #16792

Closed michael-s-molina closed 2 years ago

michael-s-molina commented 2 years ago

Superset Frontend Testing Guidelines

We feel that tests are an important part of a feature and not an additional or optional effort. That's why we colocate test files with functionality and sometimes write tests upfront to help validate requirements and shape the API of our components. Every new component or file added should have an associated test file with the .test extension.

We use Jest, React Testing Library (RTL), and Cypress to write our unit, integration, and end-to-end tests. For each type, we have a set of best practices/tips described below:

Jest

Write simple, standalone tests

The importance of simplicity is often overlooked in test cases. Clear, dumb code should always be preferred over complex ones. The test cases should be pretty much standalone and should not involve any external logic if not absolutely necessary. That's because you want the corpus of the tests to be easy to read and understandable at first sight.

Avoid nesting when you're testing

Avoid the use of describe blocks in favor of inlined tests. If your tests start to grow and you feel the need to group tests, prefer to break them into multiple test files. Check this awesome article written by Kent C. Dodds about this topic.

No warnings!

Your tests shouldn't trigger warnings. This is really common when testing async functionality. It's really difficult to read test results when we have a bunch of warnings.

Example

You can find an example of a test here

React Testing Library (RTL)

Keep accessibility in mind when writing your tests

One of the most important points of RTL is accessibility and this is also a very important point for us. We should try our best to follow the RTL Priority when querying for elements in our tests. getByTestId is not viewable by the user and should only be used when the element isn't accessible in any other way.

Don't use act unnecessarily

render and fireEvent are already wrapped in act, so wrapping them in act again is a common mistake. Some solutions to the warnings related to act might be found here.

Be specific when using *ByRole

By using the name option we can point to the items by their accessible name. For example:

screen.getByRole('button', { name: /hello world/i })

Using the name property also avoids breaking the tests in the future if other components with the same role are added.

userEvent vs fireEvent

Prefer the user-event library, which provides a more advanced simulation of browser interactions than the built-in fireEvent method.

Usage of waitFor

Example

You can find an example of a test here.

Cypress

Prefer to use Cypress for e2e testing and RTL for integration testing

Here it's important to make the distinction between e2e and integration testing. This article gives an excellent definition:

End-to-end testing verifies that your software works correctly from the beginning to the end of a particular user flow. It replicates expected user behavior and various usage scenarios to ensure that your software works as whole. End-to-end testing uses a production equivalent environment and data to simulate real-world situations and may also involve the integrations your software has with external applications.

A typical software project consists of multiple software units, usually coded by different developers. Integration testing combines those software units logically and tests them as a group Essentially, integration testing verifies whether or not the individual modules or services that make up your application work well together. The purpose of this level of testing is to expose defects in the interaction between these software modules when they are integrated.

Do not use Cypress when RTL can do it better and faster. Many of the Cypress tests that we have right now, fall into the integration testing category and can be ported to RLT which is much faster and gives more immediate feedback. Cypress should be used mainly for end-to-end testing, replicating the user experience, with positive and negative flows.

Isolated and standalone tests

Tests should never rely on other tests to pass. This might be hard when a single user is used for testing as data will be stored in the database. At every new test, we should reset the database.

Cleaning state

Cleaning the state of the application, such as resetting the DB, or in general, any state that might affect consequent tests should always be done in the beforeEach hook and never in the afterEach one as the beforeEach is guaranteed to run, while the test might never reach the point to run the afterEach hook. One example would be if you refresh Cypress in the middle of the test. At this point, you will have built up a partial state in the database, and your clean-up function will never get called. You can read more about it here.

Unnecessary use of cy.wait

Accessibility and Resilience

The same accessibility principles in the RTL section apply here. Use accessible selectors when querying for elements. Those principles also help to isolate selectors from eventual CSS and JS changes and improve the resilience of your tests.

References

geido commented 2 years ago

RTL

https://kentcdodds.com/blog/avoid-nesting-when-youre-testing

+1

Our test files use the test suffix. We should migrate those that use the spec suffix

+1

Priorities

I think priorities should be remarked as it is a common mistake: https://testing-library.com/docs/queries/about/#priority

getByTestId

This is not viewable by the user and should only be used when the element isn't accessible in any other way.

Using act unnecessarily

render and fireEvent are already wrapped in act, so wrapping them in act again is a common mistake. Some solutions to the warning related to act might be found here: https://kentcdodds.com/blog/fix-the-not-wrapped-in-act-warning

Using *ByRole

By using the name option we can point to the items by their accessible name. For example:

screen.getByRole('button', {name: /hello world/i})

Prefer userEvent over fireEvent

The user-event library better represents the actual user actions.

@testing-library/user-event

Usage of waitFor

Materials:

https://kentcdodds.com/blog/common-mistakes-with-react-testing-library https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA/Roles

.....to be continued

michael-s-molina commented 2 years ago

These are great points @geido. Thank you!

geido commented 2 years ago

RTL

...continuing on the topic of RTL

Simplicity

I think it's often overlooked how much simplicity is important in test cases. I believe clear, dumb code should always be preferred over complex one. The test cases should be pretty much standalone and should not involve any external logic if not absolutely necessary. That's because you want the corpus of the tests to be easy to read and understandable at first sight.

geido commented 2 years ago

Cypress

I am not much of a Cypress expert and I am taking what I think are the most relevant points for us from the official documentation, as well as adding some thoughts from my times working with it.

Do not use Cypress :)

Do not use Cypress when RTL can do it better and faster. Many of the Cypress tests that we have right now can be ported into RLT that is much faster and gives a more immediate feedback. Cypress should be used mainly for processes from A to B, B to C, etc. replicating the user experience, positive and negative flows.

Isolated and standalone tests

Tests should never rely on other tests to pass. This might be hard when a single user is used for testing as data will be stored into the database. At every new test we should reset the database. This discussion goes a bit out of the scope of the testing guidelines but might be worth pursuing.

Cleaning state

Cleaning the state of the application, such as resetting the db, or in general any state that might affect consequent tests, should always be done in the beforeEach hook and never in the afterEach one as the beforeEach is guaranteed to run, while the test might never reach the point to run the afterEachhook.

Unnecessary use of cy.wait

cy.intercept('GET', '/users', [{ name: 'Maggy' }, { name: 'Joan' }]).as(
  'getUsers'
)
cy.get('#fetch').click()
cy.wait('@getUsers') // <--- wait explicitly for this route to finish
cy.get('table tr').should('have.length', 2)

Resilience

Prefer data-* attributes to isolate selectors from eventual CSS, JS or text content changes. Creating dedicated data selectors specifically for Cypress might increase the boilerplate a bit but comes with the advantage of being super resilient while making it apparent that the code is being tested. For example:

cy.get('[data-cy=submit]').click()

Alternatively using text might work but it is not as resilient. For example:

cy.contains('Submit').click()

Materials:

https://docs.cypress.io/guides/references/best-practices https://www.youtube.com/watch?v=5XQOK0v_YRE

michael-s-molina commented 2 years ago

@geido @eschutho @lyndsiWilliams I compiled the contributions so far, added some more, and mounted the first version in the PR's description.

rusackas commented 2 years ago

One small nit... the philosophy of using data attributes for Cypress is accurate, but thus far we've been preferring data-test instead of data-cy as outlined above.

michael-s-molina commented 2 years ago

One small nit... the philosophy of using data attributes for Cypress is accurate, but thus far we've been preferring data-test instead of data-cy as outlined above.

Thanks for the review. Fixed.

etr2460 commented 2 years ago

Top level comment, do we want to summarize a list of Dos and Don'ts like Evan did for the emotion guidelines at the top? Not sure if that pattern always works

etr2460 commented 2 years ago

One other comment, if you could link an example test file or two to the guidelines (ideally a direct link to the current sha so it never changes) that could be helpful. Sometimes showing works better than telling (and honestly, i love being able to copy a test file as a template for the new one i'm making)

etr2460 commented 2 years ago

Oh sorry, one more thing (but not blocking):

Under cypress you have:

Prefer data-test attributes to isolate selectors from eventual CSS, JS or text content changes.

Isn't that the opposite as what we recommend for RTL? If it's best practices, then I guess it's fine. But maybe we should further call out why we have 2 different recommendations?

michael-s-molina commented 2 years ago

Top level comment, do we want to summarize a list of Dos and Don'ts like Evan did for the emotion guidelines at the top? Not sure if that pattern always works

That was my initial thought also but I preferred to add more context to each DOs and Don'ts because some are not very clear without the context, e.g. "Do not use Cypress". But it's following the Do's and Don't approach: Dos:

Don'ts:

One other comment, if you could link an example test file or two to the guidelines (ideally a direct link to the current sha so it never changes) that could be helpful. Sometimes showing works better than telling (and honestly, i love being able to copy a test file as a template for the new one i'm making)

Great suggestion. Will do.

Isn't that the opposite as what we recommend for RTL? If it's best practices, then I guess it's fine. But maybe we should further call out why we have 2 different recommendations?

Wow! Nice catch. I do agree with you. We can query for elements in Cypress using accessible attributes like role. I was reading the Cypress Best Practices and it seems their main concern is to use selectors that don't change over time. So I think we'll be fine following RTL query Priority and only use the data-test as a last resort.

eschutho commented 2 years ago

This is great @michael-s-molina!

I'm not sold on the "Avoid nesting when you're testing" point. I personally like the abstractions and groupings and dryer code that comes from that long-used pattern. If most other people are really in favor of this new pattern, then that's cool, but it's not my first choice.

eschutho commented 2 years ago

Can we add more explanation to these points so that people reading in the future also understand the why:

Usage of waitFor Do not use it for elements that can be queried with find It should only have one assertion at a time


Can we instead of discouraging people from using Cypress just relay the difference between end-to-end and integration tests? We still need people to continue to write end to end tests. :)


Cleaning state Cleaning the state of the application, such as resetting the DB, or in general, any state that might affect consequent tests should always be done in the beforeEach hook and never in the afterEach one as the beforeEach is guaranteed to run, while the test might never reach the point to run the afterEach hook.

when does the afterEach not run? I think it's better to clean up after yourself than after someone else. :)


Another common antipattern is that people will add classes or ids just for testing. I know we're talking about using roles and test-ids, etc, but do you think we can explicitly discourage adding anything into code for testing except for a test-id?

michael-s-molina commented 2 years ago

I'm not sold on the "Avoid nesting when you're testing" point. I personally like the abstractions and groupings and dryer code that comes from that long-used pattern. If most other people are really in favor of this new pattern, then that's cool, but it's not my first choice.

No problem! I personally prefer to avoid describe because of the points described in Kent C. Dodds article but I'm also fine if the majority chooses to use it.

Can we add more explanation to these points so that people reading in the future also understand the why:

Yes! Will do.

Can we instead of discouraging people from using Cypress just relay the difference between end-to-end and integration tests? We still need people to continue to write end to end tests. :)

I was reading again and you're right! We can make it more clear that we're favoring RTL for integration tests when possible but Cypress remains essential for e2e tests. Nice catch!

when does the afterEach not run? I think it's better to clean up after yourself than after someone else. :)

@geido Can you detail this?

Another common antipattern is that people will add classes or ids just for testing. I know we're talking about using roles and test-ids, etc, but do you think we can explicitly discourage adding anything into code for testing except for a test-id?

I think so. Do you have a sentence in mind to make it more clear? You can reply here or edit the issue's description directly 😉

geido commented 2 years ago

when does the afterEach not run? I think it's better to clean up after yourself than after someone else. :)

@geido Can you detail this?

@michael-s-molina @eschutho The official documentation goes in great length explaining the logic behind this. It is not only the fact that the afterEach might not be reached (for instance, by refreshing Cypress in the middle of the test you would build a partial state) but also to keep the ability to work with dangling state https://docs.cypress.io/guides/references/best-practices#Using-after-or-afterEach-hooks

eschutho commented 2 years ago

Thanks @geido that article makes sense in the context of Cypress and interacting with the test after it has ran. I think it would be helpful to link that article in the docs, too.

michael-s-molina commented 2 years ago

Thanks @geido that article makes sense in the context of Cypress and interacting with the test after it has ran. I think it would be helpful to link that article in the docs, too.

Text updated and link added:

Cleaning state Cleaning the state of the application, such as resetting the DB, or in general, any state that might affect consequent tests should always be done in the beforeEach hook and never in the afterEach one as the beforeEach is guaranteed to run, while the test might never reach the point to run the afterEach hook. One example would be if you refresh Cypress in the middle of the test. At this point, you will have built up a partial state in the database, and your clean-up function will never get called. You can read more about it here.

michael-s-molina commented 2 years ago

Can we add more explanation to these points so that people reading in the future also understand the why:

Done

Can we instead of discouraging people from using Cypress just relay the difference between end-to-end and integration tests? We still need people to continue to write end to end tests. :)

Done

Also added a References section.

I'll send an email to the dev list about this proposal.

eschutho commented 2 years ago

I know this is currently being voted on, but were there other opinions on the no describe nesting recommendation? I still personally like to write my tests like a tree and keep splitting them in to two cases. It's much easier to read that with nesting rather than flat. https://github.com/preset-io/superset/blob/43b92b220f465a08c6322b0ee58aaa0768e8e69[…]/spec/javascripts/dashboard/components/PropertiesModal_spec.jsx

lyndsiWilliams commented 2 years ago

Hi! Sorry for chiming in while this is being voted on, but I think it's important to add that we should make sure not to change the codebase in order to fix a test. The only changes that should be added to code regarding tests should be adding a test-id if absolutely necessary.

michael-s-molina commented 2 years ago

I know this is currently being voted on, but were there other opinions on the no describe nesting recommendation?

It seems people are fine with this. At least nobody else raised concerns so far and it seems this pattern is already being used by newer tests.

michael-s-molina commented 2 years ago

Hi! Sorry for chiming in while this is being voted on, but I think it's important to add that we should make sure not to change the codebase in order to fix a test. The only changes that should be added to code regarding tests should be adding a test-id if absolutely necessary.

When we use TDD we are always changing the codebase in order to fix a test. I'm assuming you're talking about specific types of changes that should be avoided. To make this distinction clearer, can you give an example of a change in our codebase that is currently being made to fix a test?

eschutho commented 2 years ago

I believe @lyndsiWilliams was referring to adding selectors such as ids and classes that are only used for tests. That test-ids are preferable.