new lint: `use_iterables` (was catch places where "map" is used instead of "forEach")

MichaelRFairhurst commented 6 years ago

Much like unnecessary_statements.

I found myself wasting a bunch of time recently where I did things like:

list.map((k) => print(k));
// and
map = {};
list.map((k) => map[k.name] = k);
// and
list.map(recordItem);

and didn't think about how map is lazy and would do nothing in these cases. I thought I simply had empty lists in my tests! Linter could definitely catch this based on MethodElement and the fact that its return value is not used.

Seems like a standard enough function that either unnecessary_statements could catch it, or it could be its own rule. Note that unnecessary_statements doesn't catch it now because it assumes all method calls have effects.

jamesderlin commented 3 years ago

People not reading and understanding that Iterable.map returns a lazy iterator and immediately discarding its return value has come up a number of times on StackOverflow:

pq commented 3 years ago

Thanks for the examples @jamesderlin! Clearly this is confusing a bunch of folks.

I wonder if a notification that the result of map was unused (e.g., feedback from the newly added @useResult annotation) would be a sufficient nudge?

(Technical aside: SDK libraries can't use package:meta so it's not as easy as that but I do wonder if the idea has any legs.)

/fyi @lrhn @eernstg @natebosch @jakemac53 @munificent (handful of language folks who might have thoughts)

natebosch commented 3 years ago

I would be in favor of making a private SDK copy of the annotation which is also understood by the lint for that use case.

jakemac53 commented 3 years ago

Ya I think this would be a good idea, and if we need to hack it with something like Nate said that is worth while. There are probably other things in the SDK which could use this also. For instance Iterable.take, or really any method that returns an iterable.

jakemac53 commented 3 years ago

On second thought maybe it is worth actually adding a special lint around unused iterables. I can't think of a situation where that could ever be doing what the user intended? Does anybody have a counter example?

lrhn commented 3 years ago

The lazy map is a common pitfall for anyone coming from a language with an eager map behavior, but it's not the only possible error.

Any unused Iterable is highly suspicious by itself. So is an unused Stream. Both are objects which are expected to only do something when you activate them. Creating them and not activating them means the creation was wasted work.

(The common suggestion to do .toList() on a .map(...) to trigger the computation, and then ignore the list, is wasteful - but short. I'd suggest to use .last if it wasn't because we optimize that to not always scan everything. The best alternative is probably .forEach((_){}). Then you should just use a for(var x in...) {f(x); } instead of ....map(f). If the lint gives a suggestion, that should be it for map.)

bwilkerson commented 3 years ago

If the lint gives a suggestion, that should be it for map.

We aren't restricted to giving a single suggestion, so consider suggesting both. The documentation should explain when to choose one over the other.

jakemac53 commented 3 years ago

I do think this should be a general purpose lint for iterables, maybe just called use_iterables and probably also use_streams. But I am not sure if we can provide a general purpose fix in that case? I don't know if we could provide some context aware fixes or something to specialize it for map.

pq commented 3 years ago

Is there a compelling reason not to have this be an analyzer warning? (EDIT: which is just to say opted-in by default, in analyzer, and not as an optional lint.)

As for specific fixes, I don't see any reason we couldn't have a family of fixes to handle the common cases in specific ways.

bwilkerson commented 3 years ago

But I am not sure if we can provide a general purpose fix in that case?

I'm not sure why offering to wrap the expression computing the Iterable in a for loop wouldn't be a general purpose fix.

I don't know if we could provide some context aware fixes or something to specialize it for map.

I might be confused, but I think @lrhn was talking about the correction message we display whereas you're talking about quick fixes. However, in either case the answer is: yes, we can provide context aware messages and context aware fixes.

jakemac53 commented 3 years ago

I'm not sure why offering to wrap the expression computing the Iterable in a for loop wouldn't be a general purpose fix.

I think the for loop or toList fixes are fairly specific to map. There are lots of other ways to get iterables that don't involve a transformation of a source collection.

bwilkerson commented 3 years ago

I don't understand why the way the Iterable is obtained impacts the set of fixes that are appropriate. If I wrote

Iterable<int> primes(int upperBound) sync* { ... }

void f(List<String> strings) {
  primes(24); // [1]
  strings.map((s) => s.length); // [2]
}

why would .toList() or a for loop be any less appropriate on line [1] than on line [2]?

jakemac53 commented 3 years ago

why would .toList() or a for loop be any less appropriate on line [1] than on line [2]?

It would be appropriate in both of those cases, but there are ways to get iterables that don't have any side effects, so calling toList() is not the correct fix. Maybe these are not useful use cases or don't come up in the real world though, but consider:

const things = [1, 2, 3];

void main() {
  things.take(1);
  things.takeWhile((i) => i < 2);
  things.getRange(1, 2);
  things.skip(1);
  things.skipWhile((i) => i < 2);
}

It is clear that all of these examples are not doing anything useful, but it isn't clear what the user intended.

Maybe it would make sense to include a fix which is to create a local variable and assign the result to it? That is one likely fix for this type of error.

bwilkerson commented 3 years ago

... there are ways to get iterables that don't have any side effects ...

I guess I'm assuming that most of the ways to get an iterable wouldn't have any side effects, but maybe I'm wrong and that's the piece I'm missing.

... so calling toList() is not the correct fix.

I don't think that things.take(1).toList(); is any less appropriate than strings.map(...).toList();, which is to say it's rarely the right fix for any unused iterable. :-) We might not want to suggest that fix ever.

Maybe it would make sense to include a fix which is to create a local variable and assign the result to it?

Yes, it would.

natebosch commented 3 years ago

I do think it would be worthwhile to build a prototype use_iterables lint and run it over a bunch of code to see if there are any legitimate false positives.

lrhn commented 3 years ago

The "correct" fix for something.map(functionWithSideEffect) is something.forEach(functionWithSideEffect) (then possibly subject to the "don't use forEach with a function literal" lint).

The correct fix for something.map(pureFunction) (or any other iterable with no side effect which is not used) is // nothing.

Since we can't detect whether functions have side effects in general, the recommendation will be heuristic, and I think recommending itrbl.forEach(f) or for(var x in itrbl) f(x); for itrbl.map(f) is a heuristically good suggestion. For every other unused iterable, I'd probably just tell the user that they're not using it.

pq commented 3 years ago

I started a POC in #2880. It's currently limited in scope to the common map case but I wonder, should we extend this to:

any method invocation/property access that leaves an iterable unused? or
just the ones defined on Iterable (take, takeWhile, etc?)

I'm tempted to try (1) and do some testing but would appreciate some feedback.

Thanks!

bwilkerson commented 3 years ago

I think it would be useful to try just looking for map like your POC, and also trying ever invocation that produces an iterable. The comparison might prove informative.

jakemac53 commented 3 years ago

Ya I agree it would be useful to compare both options if possible, a general unused iterable lint as well as the map specific one. The map case I would expect dominates because it has meanings in other languages that are not the same as dart, so its unexpected.

pq commented 3 years ago

Some examples, of unintended laziness in the wild:

  Iterable<Annotation> get annotations => _annotations ??= element.metadata
      .whereNot((m) =>
          m.element == null ||
          packageGraph.specialClasses[SpecialClass.pragma].element.constructors
              .contains(m.element))
      .map((m) => Annotation(m, library, packageGraph));

https://github.com/dart-lang/dartdoc/blob/master/lib/src/model/model_element.dart#L391

class FileGlobFilter extends LintFilter {
  Iterable<Glob> includes;
  Iterable<Glob> excludes;

  FileGlobFilter(Iterable<String> includeGlobs, Iterable<String> excludeGlobs)
      : includes = includeGlobs.map((glob) => Glob(glob)),
        excludes = excludeGlobs.map((glob) => Glob(glob));

  @override
  bool filter(AnalysisError lint) {
    return excludes.any((glob) => glob.matches(lint.source.fullName)) &&
        !includes.any((glob) => glob.matches(lint.source.fullName));
  }
}

https://github.com/dart-lang/sdk/blob/1795c2c8047456b3ee01ea0aafafca2578d6521e/pkg/analyzer/lib/src/lint/linter.dart#L156

My hunch is that returning or storing lazy iterables is probably not intended in general and that cases where it is, would be well served with an ignore and a comment,

// Really, I do mean to be lazy here.
// ignore: use_iterables

jakemac53 commented 3 years ago

Is there a correctness issue in those examples, they aren't clearly unintentional to me? Granted they are probably wasteful, especially the glob case as it needs to re-parse the globs. But the annotation case seems less clear.

I think we would get a lot of false positives if it was triggered on this type of code.

pq commented 3 years ago

Is there a correctness issue in those examples, they aren't clearly unintentional to me?

It's a good point. To my eyes, these are pretty wrong and the fact that map is re-evaluated on every iteration is incorrect in that it's needless and expensive. It's also surprising and I think it's a bit of a gift to maintainers to avoid being surprising... (BTW, the glob one is almost certainly my code. 😬)

As a general rule, it feels to me like you almost never want this behavior and if you do, an ignore is the way to go. Alternatively, something like a @lazy annotation or something might be interesting...

I think we would get a lot of false positives if it was triggered on this type of code.

And this is for sure a concern and why I'm doing some analysis w/ SDK sources to start (and g3 down the road).

@munificent: I'm curious if you have any gut reactions here (from a readability perspective)

(Needless to say, any and all feedback is very welcome!)

pq commented 3 years ago

Incidentally, @eernstg opened https://github.com/dart-lang/sdk/issues/47221 proposing a LazyList which seems like a better way to handle the case where you really do want laziness when you expect zero or one accesses. Also, @leafpetersen points out that there's already a CachingIterable in the flutter foundation classes (see here).

pq commented 3 years ago

Recently asked on reddit:

Why does .map() return an Iterable rather than a List?

https://www.reddit.com/r/dartlang/comments/pfewmi/why_does_map_return_an_iterable_rather_than_a_list/

(Cheers @Hixie and @jakemac53 for the useful responses... 👍)

pq commented 3 years ago

FWIW: The FileGlobFilter cleanup this flagged landed in https://github.com/dart-lang/sdk/commit/a1bafa191c1d0f910b24eab59bd4bda75f9682d3

jakemac53 commented 3 years ago

If we want to lint for returning iterables or storing them in a field I think that should be a separate lint - I don't think it meets the bar (in terms of false positives) for something we would be able to add to the recommended lint set, which I do think the general lint around using iterables should be added to. The lint for simply using them in some way should have essentially zero false positives.

pq commented 3 years ago

If we want to lint for returning iterables or storing them in a field I think that should be a separate lint

+1

I don't think it meets the bar (in terms of false positives) for something we would be able to add to the recommended lint set

This is really useful and makes sense to me.

The lint for simply using them in some way should have essentially zero false positives.

I tend to agree and my initial analysis supports it.

If we run with this idea, it looks like maybe we have two lints:

use_iterables (or do_not_discard_iterables or similar)
do_not_store_iterables (or something)

Aside: both of these ideas correspond to existing annotations (@useResult and @doNotStore respectively) -- not that we can use them in the SDK (or should we for the storage case which could in fact have legitimate exceptions) but it is interesting to see these themes converge...

dart-lang / linter

new lint: `use_iterables` (was catch places where "map" is used instead of "forEach") #855