chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
1.79k stars 421 forks source link

Organize test/ #5599

Open ben-albrecht opened 7 years ago

ben-albrecht commented 7 years ago

The test/ directory is pretty messy. As of writing, here is what the directory hierarchy looks like on master:

tree -d $CHPL_HOME/test

This is a low priority task, but this issue will allow us to track what is problematic about the current test hierarchy and what can be done to improve it.

For example, an easier first target might be test/modules.

updated to reflect Brad's suggestions

├── library
│   ├── standard
│   │   ├── Assert
│   │   ├── Barrier
│   │   ├── ...
│   ├── packages
│   │   ├── BLAS
│   │   ├── Curl
│   │   ├── ...
├── modules
│   ├── ambiguous
│   ├── scope
│   ├── warnings
│   ├── ...

My preference on this issue is that we should:

bradcray commented 7 years ago

I don't think 'meta' is right... the challenge here is that 'test/modules' is currently serving dual purposes: both testing the language concept of a module (what I think you're calling 'meta', but that sounds way too esoteric for something that's a core component of the language) and the modules that make up the standard library. I think a cleaner organization would be to make a test/library/ directory that contains the standard/ and packages/ directories you've got above and then a test/modules/ directory that tests the language concept.

bradcray commented 7 years ago

(That said, I also want to capture some of the reasons that I generally haven't been supportive of spending the time and energy reorganizing the test/ directory:

). To that end I'd be much more supportive of this effort after we've achieved a greater rate of adoption than now.

lydia-duncan commented 7 years ago

To me, the problem with the current organization is that it makes determining our testing coverage more difficult. If our testing hierarchy were better organized, we could more easily spot areas that aren't currently covered by testing, add testing for those areas, and be more confident that our implementation wasn't hiding dark corners.

If we had better organization, when someone makes a change to an aspect of the language they would be able to briefly look at the tests related to that change and be more confident that a full testing run would demonstrate that they hadn't done anything foolish, or know what areas weren't covered and would need tests written. For a test-driven development style, this would speed up the process.

Better organization would reduce the number of duplicate tests written, saving development time.

lydia-duncan commented 7 years ago

That said, I agree with your points about it:

bradcray commented 7 years ago

If our testing hierarchy were better organized, we could more easily spot areas that aren't currently covered by testing, add testing for those areas, and be more confident that our implementation wasn't hiding dark corners.

I find it hard to believe that imposing a better directory hierarchy would make it simpler to determine what features are insufficiently tested. Wouldn't you need to read through every test in a directory to determine whether or not the feature was covered? This also seems to presuppose that each test tests only one thing and nothing else, such that it will have only one obvious slot in the pachinko tree where it could live; in practice, I think tests often cover multiple language features.

If we had better organization, when someone makes a change to an aspect of the language they would be able to briefly look at the tests related to that change and be more confident that a full testing run would demonstrate that they hadn't done anything foolish, or know what areas weren't covered and would need tests written.

I agree that an optimal test organization would make it easier to test, say, arrays only -- or more compellingly, slices on arrays only without testing everything else about arrays. But this can already be done to some extent today, simply at a coarser level than might be ideal (e.g., all of arrays/ would need to be tested). And, as new tests are being added, I think we've been trying to organize them better than historical ones (or, I have at least). That said, a full parallel test run only takes us about 20 minutes(?) today which makes the barrier for firing off a speculative test run pretty low -- i.e., I find it more worthwhile to do that and work on something else for 20 minutes than to spend even a minute of my time looking for a subset of pre-existing tests that might be most relevant to my mod. Of course, this doesn't help external developers, at least until we have some open solution for testing branches on AWS (or somesuch), which is regrettable. But we also don't have many external developers adding core language features or hindered by lack of test directory organization, that I'm aware of.

Better organization would reduce the number of duplicate tests written, saving development time.

Again, I'm not sure I believe this. The effort to look through tests to see whether a test covering a particular condition exists -- even assuming a better organization -- almost never seems like a better use of time than writing a new one to me.... perhaps when the test is sufficiently complex. I also think there's benefit in unwittingly filing "duplicate" tests because they're almost never identical to pre-existing ones and, as we know, sometimes it doesn't take a very big difference to reveal a weakness in the implementation.

lydia-duncan commented 7 years ago

That said, a full parallel test run only takes us about 20 minutes(?) today which makes the barrier for firing off a speculative test run pretty low -- i.e., I find it more worthwhile to do that and work on something else for 20 minutes than to spend even a minute of my time looking for a subset of pre-existing tests that might be most relevant to my mod.

But given the fact that we definitely don't have testing coverage right now, there's a reasonable possibility that there isn't a test for a particular corner case a developer may have broken. As an example - @ben-albrecht ran into a lot of issues with enums on different than default sizes a week or two ago. He only stumbled across those because he happened to be looking at something related to it and so he added futures for them. If those errors were introduced in the recent past by someone making a mod to how enums are handled (instead of having existed from the implementation of that support), a full paratest would not have picked that up, because we didn't have tests for them. Relying on paratest telling you if you broke something only works when that feature has reasonable unit test coverage, and right now there are some features that don't.

We need better unit test coverage. Having a better test organization will make improving our unit test coverage easier. It would allow some motivated good samaritan to come in, see a test case was missing and add it.

Yes, there's always going to be tests that cover multiple language features. And there should be. But we can't rely on complex tests covering the functionality of a feature for edge cases, we have to have basic cases that isolate individual practices.

It will make tracking down problems easier: if you are working on feature A and you run testing and see that tests B, C, and D are failing, and B is known to be a unit test (either based on the hierarchy of the directory structure or the name of the test), you can start working on fixing B and that will likely reduce the number of tests failing way faster than going through C and D and categorizing their failures first.

ben-albrecht commented 7 years ago

After being bitten by OS X case-insensitivity mixed with inconsistency of test directory capitalization, I'd like to propose that test directories reflect the module names they test, including capitalization, whenever we make this change.

Updating the example above to demonstrate this.

mppf commented 6 years ago

I'd like to see more organization improvement than just the test/modules directory. I frequently wish for top-level names in test/ for language features. E.g. test/errors, test/coercion, test/foralls, test/records.

Yes, we arguably have places to store all of these things, but why is coercions in test/functions? Or why is records in test/types? I think when the # of tests is big enough, such language ideas should get their own directory.

I don't think that minimizing the number of top-level test/* directories is really a goal - it's more making it easy for the next person to correctly file a new test.

ben-albrecht commented 6 years ago

Responding to @bradcray's question elsewhere:

[I] am wondering whether anyone proposed changing $CHPL_HOME/modules to $CHPL_HOME/library (or libraries) or $CHPL_HOME/lib for greater symmetry and to say more about what the modules library does rather than what it contains.

I am all for converging towards more symmetry. Note that we already use $CHPL_HOME/lib for C runtime build files, so using $CHPL_HOME/library may be confusing (and would be awful for tab-completion!)

We could also try to consolidate the internal/, dists/ and layouts tests into test/library to improve symmetry if that's desired. I understand there are good arguments for keeping those separated, though.

This is all keeping in mind that this is a low priority issue, so these additional changes may not be worth making for some time.

mppf commented 6 years ago

If we renamed modules to library, we'd probably want to rename lib as well, since it only stores the built C runtime.