Leak Tracker should avoid flagging non-leaks in Flutter tests

matthew-carroll commented 2 months ago

The Leak Tracker currently identifies non-leaks as leaks in Flutter tests for disposable objects.

The following is an example of such a false-positive:

void main() {
  testWidgets("my widget text", (tester) async {
    final focusNode = FocusNode();

    await tester.pumpWidget(
      MaterialApp(...),
    );
  });
}

The Leak Tracker says that we should explicitly call dispose() on focusNode to avoid a leak. However, because the Flutter test runner resets the widget tree for every test, there shouldn't be any need at all to call dispose() within the test. There's no leak.

This same situation applies to things like ScrollController, TextEditingController, etc.

What to do about it

Option 1: Do nothing

The first option is to do nothing. The Leak Tracker will identify situations like the one above as leaks, and developers working on the Flutter framework will be forced to add a teardown for every such object, which calls dispose().

Pro's:

All tests have a recipe for what to do, without any thinking. And adding teardowns shouldn't hurt anything.

Con's:

Unknowing developers will read these teardowns, assume that they're necessary, and draw incorrect conclusions about how the Flutter test runner works. This will lead to a growing number of community members who spread incorrect information about the test runner (imagine Twitter posts telling all developers to always add teardowns with calls to dispose so they don't get memory leaks).
When multiplied across all such tests, Flutter will accumulate something on the order of thousands of unnecessary lines of code.

Option 2: Don't identify these as leaks

The second option is to adjust the Leak Tracker so that it doesn't identify non-leaking disposable within tests as leaks.

The pro's and con's here are the inverse of those listed above.

How to avoid false positives in tests

In the general case, any disposable which is declared within a test() or testWidgets() method, should be safe from memory leaks. When that test ends, the declaration scope is gone, and the widget tree has also been released by the test runner. Therefore, GC is free to collect the disposable, without explicitly calling dispose.

Regex could probably be used to achieve this filtering:

Run the Leak Tracker as-is, and collect the memory leak candidates.
Inspect each memory leak candidate that exists in a test suite. Run regex against each of those test files and check whether the cause of the memory leak is inside a call to testWidgets() or outside. If it's inside, then throw away that leak candidate because it isn't a leak. If it's outside of testWidgets() then it might be a legitimate leak, report it.

polina-c commented 2 months ago

Thanks for opening this discussion and thus creating opportunity to discuss the concerns (not uncommon) in details.

I would like to add couple clarifications.

resets the widget tree for every test, there shouldn't be any need at all to call dispose() within the test.

Method 'dispose' cleans up allocated resources. Some of them may not relate to widget tree. For example, native memory, allocated for images. Such resources are not released after reset of the widget tree.

It relates to the nature of the method 'dispose'. The rule is 'call dispose every time, because it is incapsulated and responsibility of the method dispose' to decide if there are resources to clean up. And it may be that in current version of library there is no resources to cleanup, but next version will update method dispose to make real release.

The bottom line is: even if in this concrete case the method 'dispose' is doing nothing, invoke it, because next versions of involved libraries may change the situation.

any disposable which is declared within a test() or testWidgets() method, should be safe from memory leaks

Leak tracker for tests does not track anything that is created 'outside' of callback of testWidgets. But inside can be direct or indirect, and some of them may flag real leaks of Flutter Framework. Here is examples:

void testWidgets(... {
    final nested = NestedDisposable(); // Creates disposable members, and some of them may be forgotten to be disposed, when `dispose` is invoked, so **true positive**.

    final simple = SimpleDisposable(); // If not disposed, possible leaks are test only, but future versions may start allocating nested disposables, so **false positive** for now, that may become **true positive** in future.

    final testOnly = instantiateTestOnlyClassWithDisposableMembers(); // The members are detected by leak tracker as test-only object, by analyzing call stack, and not flagged, so **true negative**

    pumpWidgets(MyWidget()); // MyWidget creates disposables as part of its lifecycle (for example, in state), where some of them may be forgotten to be disposed even when states are disposed during reset of the widget tree. MyWidget is not disposable, so no need to dispose anything in test, but whatever is created during lifecycle, and forgotten to be disposed, is flagged, so **true positive**.
});

How does it sound?

matthew-carroll commented 2 months ago

You provided some example code, but that looks like hypothetical code. If the Leak Tracker has identified actual leaks from disposables that are declared within tests, can you please show one of those real examples?

polina-c commented 2 months ago

Sure!

This leak of curved animation was caught by a test: https://github.com/flutter/flutter/pull/148373

matthew-carroll commented 2 months ago

Ok, for the conversation record, the curved animation leak bug was located in a framework file, not a test. The leak corresponded to a CurvedAnimation that was created and returned by a framework class, but never disposed.

Therefore, the CurvedAnimation leak bug doesn't qualify as what I've described in this ticket. The CurvedAnimation leak would only have qualified for my description if the CurvedAnimation had been instantiated within the test. But it wasn't.

Can you please show a real leak that was found and fixed in which the leak itself was attached to a variable that was instantiated within a test? Please see my FocusNode as an example.

I've asserted that it should be impossible to create a leak when instantiating a disposable within a test and then not disposing of it. We're looking for a counter-example to that assertion.

polina-c commented 2 months ago

Thanks.

Example of real memory leak: If a disposable allocated native resources, not attached to a widget tree, or instantiate or update (for example, by adding listeners) globals or static variables, it would be a leak. Also it would affect test isolation.

In addition, the method dispose can be buggy, that means it can forget to dispose some disposable members, created during lifecycle. And this should be caught by leak tracker. It will not be, if we opt out these instances.

I do not have concrete examples for such cases, but they are possible and this suggest to just dispose the instances instead of (1) creating and maintaining special logic to opt them out and (2) instructing what is ok to not dispose, because it will violate contracted area of concern for the client of the disposable.

How does it sound?

matthew-carroll commented 2 months ago

I do not have concrete examples for such cases, but they are possible...

Are they though? If there isn't a single such example in the framework's test suite, and if this leak tracker hasn't found a single such concrete example within test code, then maybe it's not possible.

In fact, my original point was exactly that. Or if they are possible, it requires a substantially different scenario from what the leak tracker is complaining about in tests.

this suggest to just dispose the instances instead of (1) creating and maintaining special logic to opt them out and (2) instructing what is ok to not dispose, because it will violate contracted area of concern for the client of the disposable.

This brings us right back to where we started. You're suggesting a blanket policy for all test writing in Flutter as a safeguard for a problem that still doesn't have a single example, and a problem which may actually be impossible due to how the test runner resets the widget tree between tests.

Unless I've missed something, we're still lacking any motivating example for why this leak tracker should impose a policy on all tests that will ever be written.

I also haven't seen how the mis-information associated with that policy will be mitigated.

If you truly cannot find a single concrete example to inform this rule, I would encourage you to adjust the approach.

polina-c commented 2 months ago

Possible example: instance of Image. It reserves native resources.

matthew-carroll commented 2 months ago

Can you please show the code? If there's no code then there's no example. I'd like to see an actual memory leak from within a test method by way of a disposable that's created in the test, just as I indicated in the OP.

polina-c commented 1 month ago

Sorry for not structuring my answers well.

“Unknowing developers will read these teardowns, assume that they're necessary, and draw incorrect conclusions about how the Flutter test runner works.” While my first reaction was to agree with this, after second thinking I realized that the Flutter test runner does not pick up items that are not in the widget tree. Is it critical? In most cases no, because in most cases there are no leaks and if there are, they are test-only. But again, if it is simple to clean, why not to, for the two goals above.

"If there's no code then there's no example. " I disagree. If there is theoretical possibility, but no practical case yet, it does not mean we should not add protection. This protection may help to clean flutter framework code and user code. There is theoretical possibility (no practical case yet) of a leak as result of not disposed object in test, if the object allocates native resources, like image. So, the tool is correct calling out this possibility of leak.

Let me add structure to the discussion.

There are two types of leaks in testing: prod and test-only. Let’s call instances, created directly in the root of the test body, test-root instances. Not disposed disposables create risk of leak. If a leak happens, it is very hard to catch and debug, so we want to avoid risks of prod leaks.

I agree that: Not disposed disposable, created directly in the test body, never mean a prod risk of leak. Test-only risks of leaks are not important to debug unless they cause out of memory failure of a test.

Should we stop caring about test-only leaks at all? I do not think so. The test framework does its best to clean up whatever possible after each test. This serves two goals: (1) memory footprint of the test run and (2) better test isolation. However, the test framework does not clean-up everything. That's why the method ‘tearDown’ exists and developers use it to serve the two goals above.

This is example of issue, caught by leak tracker, where not disposed test-only disposable caused broken test isolation: https://github.com/flutter/flutter/issues/149288

It is very hard to catch if the test isolation is broken. That’s why we have randomization for testing. And there is agreement in the team that test isolation is important. So, the suggestion is: when it is simple to clean, let’s clean for the two goals above.

Can leak tracker detect test-only not disposed and opt out from leak tracking? Yes, it can detect some of them (created in test helpers), and, yes, it misses test-root disposables. Let’s assume we could detect test-root disposables. The problem is that most disposables have disposable children, that are not disposed if parent is not disposed. And if we detect them too and opt out, we will miss all true positive risks of leaks. Because all prod leaky disposables are created as part of lyfe-cycle of test-root instances. And, if it is simple to clean, why not to, for the two goals above.

Thank you for bringing this concern and for pushing me to structure this.

dart-lang / leak_tracker