Discussion on alternatives to package:test_reflective_loader

cc @lrhn @mit-mit - this is one of the Dart team packages that we had previously discussed as one that should not be published anymore.

I'd like to start the conversation on what we'd need in package:test or otherwise to drop this package. At a low level we want to minimize the surface area of published packages maintained by the Dart team. At a high level we want to avoid highly customized team-specific approaches to writing Dart.

I've taken some of the discussion from https://github.com/dart-lang/sdk/issues/44489 to bootstrap.

@scheglov

We found out early that vanilla package:test is not structured enough for complicated package:analyzer tests.

Can you elaborate on this? Was there a design doc for test_reflective_loader that discussed the increase in structure?

@srawlins

The ability to annotate test methods with @solo or @failingTest is important for our use case, so being able to grab a class's members and examine annotations are the core parts of test_reflective_loader's uses.

How does @solo differ from the solo: argument to test() or group()?

What does @failingTest do? Is it something different from skip:? Does it specifically run the test and expect an exception (such that it would fail if the annotated test passes instead)? We could put a higher priority on https://github.com/dart-lang/test/issues/1178 if those are the semantics you have today.

What other features in package:test do we need to investigate to allow migrating back to package:test from test_reflective_loader?

How does @solo differ from the solo: argument to test() or group()?

I doubt it's different. But solo: was not offered for a long time.

Does it specifically run the test and expect an exception (such that it would fail if the annotated test passes instead)?

Yes. failingTests are always run, and expected to fail. If one passes, that is considered a "test failure", and the engineer says, "Oh look at that, there was an open bug for this all along!"

What other features in package:test do we need to investigate to allow migrating back to package:test from test_reflective_loader?

@scheglov can speak more to the structure idea, but generally our tests are organized into classes, and some inherit from others (including test methods).

I'd like to start the conversation on what we'd need in package:test or otherwise to drop this package. At a low level we want to minimize the surface area of published packages maintained by the Dart team. At a high level we want to avoid highly customized team-specific approaches to writing Dart.

We published test_reflective_loader because we use it for several packages related to package:analyzer. If there is an alternative to publishing it, I'm ready to consider them.

The amount of maintenance for test_reflective_loader was absolutely tiny. IIRC I wrote it in a few hours, and then we spent a few hours on occasions when we wanted to extent it a bit.

Actually, because Dart is object-oriented and class-based, I would argue that it is the right approach to writing tests :-P

Also, I suspect that many large packages invent such highly customized team-specific approach to testing anyway. See also CFE, it has its own flavor - sets of input files, with expectations, and their own status files.

I understand that initial reason why this discussion was started - there is a wish to remove dart:mirrors. And there are clients that use it. And to kill dart:mirrors all its clients must stop using it. Going through generalizing to "we want to avoid highly customized team-specific approaches to writing Dart" while talking only about tests is overkill IMHO. We have even more team-specific conventions about writing the code.

We found out early that vanilla package:test is not structured enough for complicated package:analyzer tests.

Can you elaborate on this? Was there a design doc for test_reflective_loader that discussed the increase in structure?

There was no design doc.

Look at this change for example. It shows our initial issues with vanilla package:test.

We need to perform several operations.
These operations are logically connected, so we group them into a helper class.
Logic for initializing this helper class should be repeated in every test method. Compare: with classes initialization logic is shared in the superclass.
All invocations should go through this helper instance. Compare: with classes all invocations go through the implicit this.
Navigation does not work nested group / test invocations - no outline, no Navigate To Symbol in IntelliJ. Compare: when tests are classes and methods, navigation works as for any other code. We have a lot of tests. Navigating quickly and seeing which tests we have is improving our work flow.

Later we started using mixins to host sets of tests and run them with slightly different configurations - in legacy, and in null safe modes for example. See this file as an example.

@srawlins

The ability to annotate test methods with @solo or @failingTest is important for our use case, so being able to grab a class's members and examine annotations are the core parts of test_reflective_loader's uses.

How does @solo differ from the solo: argument to test() or group()?

@solo is similar to solo:. One exception is that @solo is global, i.e. you can mark a few tests here and there in different classes and files, then run the whole test suite, and only these tests will run. Not sure if solo: works across group()s.

Using @solo is convenient when we do this with mixins, and run the same test in two modes.

I will also note that at some point solo: did not exist.

What does @failingTest do? Is it something different from skip:? Does it specifically run the test and expect an exception (such that it would fail if the annotated test passes instead)? We could put a higher priority on dart-lang/test#1178 if those are the semantics you have today.

See the documentation for @FailingTest. In my opinion this continues the same theme with more structured presentation. Annotations are prominent, not hidden as one of many arguments of test() invocation, and named parameters help to better document the failure - the issue, or just a text.

For the discussion, here's how reflective tests are defined:

@reflectiveTest
class UtilitiesTest  {
  test_parseFile_default_resource_provider() {
    ...
  }

  test_parseFile_errors_noThrow() {
    ...
  }

This naming convention based test definition is similar to how JUnit 3 works.

I can definitely see that reflective tests avoids some syntactic overhead, and allows code sharing through inheritance. (Arguably, having to create a class can also be seen as needles verbosity by other people, but I guess nothing technical prevents having top-level test functions when no state is needed).

Ihe reflective approach prevents some approaches supported by the test package, because the tests depend on the static structure of the program, where the test package depends on the dynamic invocations of group and test (so you can make thousands of parameterized tests in a for-loop). Different, not necessarily universally better or worse.

You would probably be able to transform the existing reflective tests into traditional tests by adding:

static void runTests() { 
  group("Utilities", () {
    test("parseFile default resource provider", UtilitiesTest().test_parseFile_default_resource_provider);
    test("parseFile default errors noThrow", UtilitiesTest().test_parseFile_errors_noThrow);
  });
}

to the UtilitiesTest class and calling that from main.

Definitely more overhead, it's exactly this code that the reflection allows you to avoid writing (and it's something which is notoriously hard to keep up-to-date, with no warning if you forget it when adding a new test). I guess negative tests would have to be wrapped in () => expect(..., throws);, or something more complicated if you want to retain the message.

It's also not using the setUp/tearDown framework for creating the UtilitiesTest instances, which is probably should. (I personally dislike the way setUp/tearDown works because it's almost inevitably based on shared variables and implicit control flow. Your "one instance per test" is objectively superior in that regard. :grin:)

So, this is effectively a different test framework. Can't prevent anyone from creating one, the only issue is who owns and maintains it, and which guarantees we provide for supporting it in the future (up to and including whether we want to keep maintaining dart:mirrors as-is to keep the reflective test package working, or if we are fine with providing a different, probably more static, reflection/meta-programming feature which can be used to a similar effect).

The Dart project as such does not have any need for a second test framework, and most would probably prefer that everybody unifies behind the one recommended test framework. So, we're definitely not going to publish the reflective test package in dart-core.

I don't see any alternatives to reflective_test which will still be class based, but without reflection, unless there is some repetition. It's impossible to declare something (statically) and refer to it dynamically without repeating its name. Without repetition, we'd be into code-generation or other meta-programming, and at that point, we're effectively doing static/source reflection anyway, just without dart:mirrors, or we can just write the static method above and call it "package:test compatible".

The reflective approach prevents some approaches supported by the test package, because the tests depend on the static structure of the program, where the test package depends on the dynamic invocations of group and test (so you can make thousands of parameterized tests in a for-loop).

Actually, this package doesn't prevent that approach because it's built on top of the test package, so there isn't any problem using group or test to generate tests programmatically, which we do in a couple of places.

So, this is effectively a different test framework.

I'd argue that it's an enhancement to the standard test framework in that it uses reflection to dynamically invoke test and group.

The Dart project as such does not have any need for a second test framework ...

I suppose that depends on your definition of "need". The analyzer tests often have enough state that writing them without this support would be considerably more painful.

... and most would probably prefer that everybody unifies behind the one recommended test framework.

I would argue that many subteams have added enhancements to test recommended test framework, but that the analyzer team just happens to be the only ones to have used dart:mirrors to have done so and to have put those enhancements in a published package.

dart-lang / sdk

Discussion on alternatives to package:test_reflective_loader #44537