INRIA / spoon

Spoon is a metaprogramming library to analyze and transform Java source code. :spoon: is made with :heart:, :beers: and :sparkles:. It parses source files to build a well-designed AST with powerful analysis and transformation API.
http://spoon.gforge.inria.fr/
Other
1.76k stars 352 forks source link

discussion: organization of tests and test resources (a bit untidy) #3677

Open slarse opened 4 years ago

slarse commented 4 years ago

Hello,

I can't find any particular rhyme or reason to the layout of the test resources. Each time I make a PR, I waste some amount of time trying to find a viable resource to use, and failing that, trying to figure out where to put a new test resource. There's something of a division into "noclasspath" test resources, and everything else, but plenty of other resources are actually used in noclasspath tests. Every other test class also seems to have its own getResource-esque method.

This problem hinders me in submitting pull requests, and especially so when I just want to quickly highlight some specific bug I've found. I think this also applies to other potential contributors, especially end users who just want to reproduce an issue they've found. As more tests and resources are added, the problem will just grow.

I suggest we decide on a clear scheme for how to place resources in relation to tests, and define a helper class with the sole purpose of fetching resource paths. All future tests would have to adopt this new scheme, and we can gradually migrate the older tests to it.

It may be helpful to also have a high-level buildModelFromResources method, to which one just passes filepaths and get a model back. There are tons of tests that repeat the small amount of boilerplate required to create a launcher, add a resource to it, and build the model.

Any thoughts on this? It may be that I'm just too much a stickler for rules, but I really find the lack of direction in placing and using test resources to be an unnecessary distraction from actually writing a test case.

MartinWitt commented 4 years ago

There's something of a division into "noclasspath" test resources, and everything else, but plenty of other resources are actually used in noclasspath tests.

There are even some test resources in /src/test/java.

suggest we decide on a clear scheme for how to place resources in relation to tests

I prefer the oldschool same package with 1 subfolder for each feature bundle/class. E.g. a test for SpoonResources.java should be in the path src/test/resources/spoon/compiler/SpoonResources/. In this folder single files and subfolders are allowed if needed. Do you have any scheme in mind?

It may be helpful to also have a high-level buildModelFromResources method, to which one just passes filepaths and get a model back. There are tons of tests that repeat the small amount of boilerplate required to create a launcher, add a resource to it, and build the model.

Yes! We should define test helpers in a single class, because we build the same methods in every test class.

The big problem is the time needed to move all test classes to correct places. It's not really an interesting pull request for any contributor. Enforcing it for new tests only creates a new issue with having tests in 3 places.

slarse commented 4 years ago

There are even some test resources in /src/test/java

Yeah, I know, some tests import the test resources, then access the class name with runtime introspection, and then add it as a resources somewhere.

Do you have any scheme in mind?

I don't have any particular preference, I just want something so I don't have to think about it when I do a PR :). I think that the scheme you suggest is viable, and makes it very easy to figure out where to look for and/or put a new test resource. It may lead to some duplication of resources that are shared between different test classes, but I think that sacrifice is worth it.

The big problem is the time needed to move all test classes to correct places. It's not really an interesting pull request for any contributor.

It's not a glamorous job, but I think it's important for the future of the project. I've had projects of my own almost capsize under the weight of unmaintainable test suites. I could work on it, I enjoy making things more maintainable.

Enforcing it for new tests only creates a new issue with having tests in 3 places.

This I don't actually think is too much of a problem. Right now, test resources can go almost anywhere (and they do). If all future test resources go in predictable places, it's nothing but an improvement in my opinion, even if the existing tests aren't adapted. If also backed up by a good helper class, it could be a quite significant improvement to the experience of adding new tests.

MartinWitt commented 4 years ago

To enhance the discussion i believe the util class for modelbuilding e.g Models should contain at least:

Models.java

For building with a path

public CtModel buildModelFromPath(String path); public CtModel buildModelFromPath(String path, int javaVersion); public CtModel buildModelFromPath(String path, int javaVersion, boolean classPath);

Building with code snippets

public CtModel buildModelFromSnippets(List<String> codesnippets); public CtModel buildModelFromSnippets(List<String> codesnippets, int javaVersion); public CtModel buildModelFromSnippets(List<String> codesnippets, int javaVersion, boolean classPath);

Building with code snippet

public CtModel buildModelFromSnippet(String codesnippet); public CtModel buildModelFromSnippet(String codesnippet, int javaVersion); public CtModel buildModelFromSnippet(String codesnippet, int javaVersion, boolean classPath);

I would skip methods like public CtModel buildModelFromSnippets(String... paths); and instead provide a List.of(....) implementation like jdk9. This could be a good starting point and if more flags are needed we could add more methods later.

Specify scheme

And add somewhere in contribute.md add the scheme. My scheme would look like this:

Sometimes starting a PR helps with the discussion.

monperrus commented 4 years ago

Thanks @slarse and @MartinWitt for the constructive proposals.

buildModelFromSnippet ....

They would fit perfectly in SpoonTestHelpers.

add somewhere in contribute.md add the scheme.

In Spoon, we try to minimize the amount of "soft" documentation and we prefer "hard" automated checks. If we can reasonably implement good rules as CI checks (architectural checks specific to tests), we can give this a try.

slarse commented 4 years ago

@MartinWitt That seems reasonable. Although I'm not entirely sure about the granularity of per-test-class resources. Perhaps per package?

@monperrus I will see if I can cook up an automated check for newly added test code such that it a) uses the test helper to fetch resources, and b) enforces that test resources are fetched from the correct place. It's probably impossible to make it bulletproof, but it can possibly be made somewhat foolproof. We can't do a global check because this would have to be implemented gradually.

slarse commented 4 years ago

So, I've been thinking and looking around the test suite, and it's probably best if we just start out with reducing code duplication by introducing helpers along the lines of those proposed by @MartinWitt . I gathered some metrics on the occurrences of certain repeating patterns in the test suite:

Pattern Occurrences
new Launcher() 690
addInputResource 628
setNoClasspath 188
getEnvironment() 528

So it seems to me like we could significantly reduce repetitive patterns by creating the suggested helpers. I'll get to work on this soon and will try to account for the most common patterns. I'll make a draft PR before putting any serious effort in.

MartinWitt commented 4 years ago

First: Nice work!

setNoClasspath

Cant we remove all these calls, because noClasspath is default? And what could be more interesting. What calls do we do on Environment ? Are there any calls we could include in default helper methods?

slarse commented 4 years ago

setNoClasspath

Cant we remove all these calls, because noClasspath is default?

16 of the calls are actually setNoClasspath(false), so they cannot be removed. I can imagine that the remaining 172 calls that set it to true were either a) put there before it was the default, or b) put there for clarity. I like the clarity of setting noClasspath explicitly in tests that are specifically related to it, as it signifies that the fact that noClasspath is somehow important for the test. I personally try to avoid relying on defaults for tests that would break if the defaults were changed, and aren't intended to actually test the defaults.

monperrus commented 4 years ago

new Launcher(), addInputResource etc. are parts of the core API. So our tests use the same abstractions as our clients which is good. If we refactor them, it would mean we are not happy with the core API and we would have to make it evolve. What are the problems with them?

slarse commented 4 years ago

The problem isn't that the API is used, but that the same patterns are repeated over and over. For example, you'll see something like this all over the place:

// setup for some test
Launcher launcher = new Launcher();
launcher.addInputResource("path/to/SomeClass.java");
launcher.addInputResource("path/to/SomeOtherClass.java");
CtModel model = launcher.buildModel();

With tiny variations in how it's done, in particular how the input resources are added. The tests that do this are in the vast majority of cases _not_testing that part of the API, so it's just noise in the setup for the test. It amounts to thousands of lines of repeated code in the test suite. The proposed solution by @MartinWitt would allow us to replace that with e.g.:

CtModel model = SpoonTestHelpers.buildModelFromPaths("path/to/SomeClass.java", "path/to/SomeOtherClass.java");

Where, of course, SpoonTestHelpers.buildModelFromPaths will use the public API in the same way that the tests are currently doing. Obviously, tests that do test the Launcher class wouldn't use that functionality, but most tests don't test the Launcher, they just need to go through it to run the actual test.

I haven't thought this all the way through, but I'm fairly confident that if we introduce tighter mechanisms for setting up test cases with helpers like this, we can clean up the test suite significantly. We could also easily implement architectural checks on the tests using the test helpers, to make for consistent use of resources.

It's possible that some form of fluent API would be better suited to this task than strictly defined test helper methods, I need to investigate the common patterns in test setup a bit more closely first.

@monperrus With this explanation, does it make sense to you?

monperrus commented 4 years ago

It amounts to thousands of lines of repeated code in the test suite.

I don't see this as a problem per se. It is not a surprise that the core API is used everywhere. It's not a problem of maintenance because those tests do not evolve.

What am I missing?

slarse commented 4 years ago

those tests do not evolve.

Until behavior changes, e.g. due to fixing bugs (look at how many tests break in #3315).

What am I missing?

Needless repetition is bad. We don't typically allow needless repetition in production code, why should tests be any different? Some more concrete things that bother me:

  1. When a test fails, the boilerplate makes it harder to debug.
  2. Writing a new test requires one to write boilerplate, which is annoying in itself. If you care about code style and visual similarity with the rest of the tests, it gets even worse as the boilerplate between tests often differs non-semantically.
  3. If usage of test resources is not centralized to known methods, implementing automatic checks for consistent use of resources is very hard or even impossible.

But of course, we don't need to refactor existing tests if that's not desirable. For me personally, I just want a simple and consistent way of writing tests, with an easy way to build models from resources (that I know where to place or look for), without requiring 4-5 lines of boilerplate when the actual logic of my test is less than that.

But, again, as I noted the start of this issue, it may just be that I'm too pedantic.

monperrus commented 4 years ago

We don't typically allow needless repetition in production code, why should tests be any different?

Because they are of different nature.

That being said we need to ground the discussion on deep thoughts or data. However, I know neither strong essays with analytical arguments nor papers with empirical evidence showing that duplication in test code is bad. If you find some, I'd love to read them!

slarse commented 4 years ago

We don't typically allow needless repetition in production code, why should tests be any different?

Because they are of different nature.

They are definitely different in nature, but in terms of maintainability and readability being important, I don't think they are that different.

That being said we need to ground the discussion on deep thoughts or data. However, I know neither strong essays with analytical arguments nor papers with empirical evidence showing that duplication in test code is bad. If you find some, I'd love to read them!

There's quite a lot of literature on code smells in test. Just from a quick survey of the field, I found Are test smells really harmful? An empirical study. They do find some effect of code duplication in tests:

Test Code Duplication: As can be observed from Fig. 11, this smell has a strong negative impact on participants correctness. In particular, in presence of this smell Fresher students decreased their F-Measure by 27 % (from 87 % to 60 %), on average, Bachelor by 23 % (from 90 % to 62 %), Master by 18 % (from 90 % to 72 %), and Industrial by 29 % (from 100 % to 71 %). Thus, surprisingly, this is the only bad smell for which industrial developers seem to be more impacted than the students. While it is not easy to explain this result, it is possible that students are used to duplicated code spread among projects they worked on during their academic career. All the differences observed in terms of correctness are statistically significant, even if with a small effect size (see Table 12). On the time side, by observing Fig. 11 it is clear that participants saved time on the refactored code. However, the observed differences are not statistically significant except with the Master students and when considering the whole dataset.

slarse commented 4 years ago

I will read up more on this and get back to you, it's entirely reasonable to base a large refactoring on empirical evidence, rather than my personal preference :)

monperrus commented 4 years ago

Cool, looking forward to your wrap-up!

andrewbwogi commented 4 years ago

Not only the resources but the test folders themselves are untidy. Almost all tests go under src/test/java/spoon/test/, organized more or less flat. I think a good idea is @MartinWitt's suggestion to put test cases in the same package as the tested unit and then have integration tests higher up in the hierarchy. If there will be a structural similarity of test and resources folders, maybe rename this issue to a more general topic like Tests are a bit untidy?

monperrus commented 4 years ago

Thanks @andrewbwogi for the follow up. Yes, the tests could be improved and we had a series of such PR in summer 2018.

maybe rename this issue to a more general topic like Tests are a bit untidy?

I'm not sure. A good issue is an issue which is closeable, with a clear, well-defined acceptance criterion to close it.

All baby-step pull-requests on improving tests are welcome, one test (resource) at a time.

slarse commented 4 years ago

@andrewbwogi I agree, it's all a little bit. Untidy.

A note on this is that a large amount of tests are more on the side of integration and system tests, using the public API to test high-level features. They can't necessarily be linked to any given class or even package. There will be no small amount of ambiguity in where to put tests if we organize everything by package and class.