Open MichaelRFairhurst opened 6 years ago
People not realizing that Iterable.map returns a lazy Iterable, and immediately discarding its return value, has come up a number of times on StackOverflow:
Thanks for the examples @jamesderlin! Clearly this is confusing a bunch of folks.
I wonder if a notification that the result of map was unused (e.g., feedback from the newly added @useResult annotation) would be a sufficient nudge?
(Technical aside: SDK libraries can't use package:meta, so it's not as easy as that, but I do wonder if the idea has any legs.)
/fyi @lrhn @eernstg @natebosch @jakemac53 @munificent (handful of language folks who might have thoughts)
I would be in favor of making a private SDK copy of the annotation which is also understood by the lint for that use case.
Ya I think this would be a good idea, and if we need to hack it with something like Nate said, that is worthwhile. There are probably other things in the SDK which could use this also. For instance Iterable.take, or really any method that returns an iterable.
On second thought maybe it is worth actually adding a special lint around unused iterables. I can't think of a situation where that could ever be doing what the user intended? Does anybody have a counter example?
The lazy map is a common pitfall for anyone coming from a language with an eager map behavior, but it's not the only possible error.
Any unused Iterable is highly suspicious by itself. So is an unused Stream. Both are objects which are expected to only do something when you activate them. Creating them and not activating them means the creation was wasted work.
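To make the pitfall concrete, here is a minimal sketch (the function names are made up for illustration) showing that the callback passed to map never runs when the returned Iterable is discarded, and only runs once something iterates it:

```dart
/// Returns the elements the callback actually visited when the result of
/// map is thrown away. (Hypothetical helper, for illustration only.)
List<int> visitedWhenDiscarded(List<int> source) {
  final visited = <int>[];
  // The returned Iterable is dropped, so the callback never runs.
  source.map((x) {
    visited.add(x);
    return x * 2;
  });
  return visited;
}

/// Same as above, but the mapped Iterable is iterated once.
List<int> visitedWhenIterated(List<int> source) {
  final visited = <int>[];
  final doubled = source.map((x) {
    visited.add(x);
    return x * 2;
  });
  // Iterating is what triggers the callback, once per element.
  for (final _ in doubled) {}
  return visited;
}

void main() {
  print(visitedWhenDiscarded([1, 2, 3])); // []
  print(visitedWhenIterated([1, 2, 3])); // [1, 2, 3]
}
```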
(The common suggestion to do .toList() on a .map(...) to trigger the computation, and then ignore the list, is wasteful, but short. I'd suggest using .last, were it not that we optimize it so it doesn't always scan everything. The best alternative is probably .forEach((_){}). But then you should just use a for (var x in ...) { f(x); } instead of ....map(f). If the lint gives a suggestion, that should be it for map.)
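A rough side-by-side of the options mentioned above, as a sketch only (runAllThreeWays and record are illustrative names, not anything from the SDK or the lint):

```dart
/// Applies a side-effecting callback three different ways and returns
/// everything it recorded. (Hypothetical helper, for illustration only.)
List<String> runAllThreeWays(List<String> items) {
  final seen = <String>[];
  void record(String s) => seen.add(s);

  // Short but wasteful: builds a list only to throw it away.
  items.map(record).toList();

  // Runs the callback for each element without allocating a list.
  items.forEach(record);

  // A plain loop: no intermediate Iterable at all.
  for (final item in items) {
    record(item);
  }

  return seen;
}

void main() {
  print(runAllThreeWays(['a', 'b'])); // [a, b, a, b, a, b]
}
```

All three run the callback once per element; the difference is only in what gets allocated along the way and how clearly the intent reads.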
If the lint gives a suggestion, that should be it for map.
We aren't restricted to giving a single suggestion, so consider suggesting both. The documentation should explain when to choose one over the other.
I do think this should be a general purpose lint for iterables, maybe just called use_iterables and probably also use_streams. But I am not sure if we can provide a general purpose fix in that case? I don't know if we could provide some context aware fixes or something to specialize it for map.
Is there a compelling reason not to have this be an analyzer warning? (EDIT: which is just to say opted-in by default, in analyzer, and not as an optional lint.)
As for specific fixes, I don't see any reason we couldn't have a family of fixes to handle the common cases in specific ways.
But I am not sure if we can provide a general purpose fix in that case?
I'm not sure why offering to wrap the expression computing the Iterable in a for loop wouldn't be a general purpose fix.
I don't know if we could provide some context aware fixes or something to specialize it for map.
I might be confused, but I think @lrhn was talking about the correction message we display whereas you're talking about quick fixes. However, in either case the answer is: yes, we can provide context aware messages and context aware fixes.
I'm not sure why offering to wrap the expression computing the Iterable in a for loop wouldn't be a general purpose fix.
I think the for loop or toList fixes are fairly specific to map. There are lots of other ways to get iterables that don't involve a transformation of a source collection.
I don't understand why the way the Iterable is obtained impacts the set of fixes that are appropriate. If I wrote
Iterable<int> primes(int upperBound) sync* { ... }
void f(List<String> strings) {
  primes(24); // [1]
  strings.map((s) => s.length); // [2]
}
why would .toList() or a for loop be any less appropriate on line [1] than on line [2]?
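For what it's worth, the two lines do behave alike in one respect: a sync* body does not start running until the returned Iterable is iterated, just as a map callback does not. A small sketch, where the body of primes is a hypothetical stand-in for the elided one and bodyRuns is a counter added purely for illustration:

```dart
// Counts how many times the generator body has been entered.
int bodyRuns = 0;

// Hypothetical stand-in for the elided `primes` body (trial division).
Iterable<int> primes(int upperBound) sync* {
  bodyRuns++;
  for (var n = 2; n <= upperBound; n++) {
    var isPrime = true;
    for (var d = 2; d * d <= n; d++) {
      if (n % d == 0) {
        isPrime = false;
        break;
      }
    }
    if (isPrime) yield n;
  }
}

void main() {
  primes(24); // Result discarded: the body is never entered.
  print(bodyRuns); // 0
  print(primes(24).toList()); // [2, 3, 5, 7, 11, 13, 17, 19, 23]
  print(bodyRuns); // 1
}
```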
why would .toList() or a for loop be any less appropriate on line [1] than on line [2]?
It would be appropriate in both of those cases, but there are ways to get iterables that don't have any side effects, and for those calling toList() is not the correct fix. Maybe these are not useful use cases or don't come up in the real world, but consider:
const things = [1, 2, 3];
void main() {
  things.take(1);
  things.takeWhile((i) => i < 2);
  things.getRange(1, 2);
  things.skip(1);
  things.skipWhile((i) => i < 2);
}
It is clear that all of these examples are not doing anything useful, but it isn't clear what the user intended.
Maybe it would make sense to include a fix which is to create a local variable and assign the result to it? That is one likely fix for this type of error.
... there are ways to get iterables that don't have any side effects ...
I guess I'm assuming that most of the ways to get an iterable wouldn't have any side effects, but maybe I'm wrong and that's the piece I'm missing.
... so calling toList() is not the correct fix.
I don't think that things.take(1).toList(); is any less appropriate than strings.map(...).toList();, which is to say it's rarely the right fix for any unused iterable. :-) We might not want to suggest that fix ever.
Maybe it would make sense to include a fix which is to create a local variable and assign the result to it?
Yes, it would.
I do think it would be worthwhile to build a prototype use_iterables lint and run it over a bunch of code to see if there are any legitimate false positives.
The "correct" fix for something.map(functionWithSideEffect) is something.forEach(functionWithSideEffect) (then possibly subject to the "don't use forEach with a function literal" lint).
The correct fix for something.map(pureFunction) (or any other iterable with no side effect which is not used) is // nothing.
Since we can't detect whether functions have side effects in general, the recommendation will be heuristic, and I think recommending itrbl.forEach(f) or for(var x in itrbl) f(x); for itrbl.map(f) is a heuristically good suggestion.
For every other unused iterable, I'd probably just tell the user that they're not using it.
I started a POC in #2880. It's currently limited in scope to the common map case but I wonder, should we extend this to:
Iterable (take, takeWhile, etc?)
I'm tempted to try (1) and do some testing but would appreciate some feedback.
Thanks!
I think it would be useful to try just looking for map like your POC, and also trying every invocation that produces an iterable. The comparison might prove informative.
Ya I agree it would be useful to compare both options if possible, a general unused iterable lint as well as the map specific one. I would expect the map case to dominate because it has meanings in other languages that are not the same as in Dart, so it's unexpected.
Some examples, of unintended laziness in the wild:
Iterable<Annotation> get annotations => _annotations ??= element.metadata
    .whereNot((m) =>
        m.element == null ||
        packageGraph.specialClasses[SpecialClass.pragma].element.constructors
            .contains(m.element))
    .map((m) => Annotation(m, library, packageGraph));
https://github.com/dart-lang/dartdoc/blob/master/lib/src/model/model_element.dart#L391
class FileGlobFilter extends LintFilter {
  Iterable<Glob> includes;
  Iterable<Glob> excludes;
  FileGlobFilter(Iterable<String> includeGlobs, Iterable<String> excludeGlobs)
      : includes = includeGlobs.map((glob) => Glob(glob)),
        excludes = excludeGlobs.map((glob) => Glob(glob));
  @override
  bool filter(AnalysisError lint) {
    return excludes.any((glob) => glob.matches(lint.source.fullName)) &&
        !includes.any((glob) => glob.matches(lint.source.fullName));
  }
}
My hunch is that returning or storing lazy iterables is probably not intended in general and that cases where it is, would be well served with an ignore and a comment,
// Really, I do mean to be lazy here.
// ignore: use_iterables
Is there a correctness issue in those examples, they aren't clearly unintentional to me? Granted they are probably wasteful, especially the glob case as it needs to re-parse the globs. But the annotation case seems less clear.
I think we would get a lot of false positives if it was triggered on this type of code.
Is there a correctness issue in those examples, they aren't clearly unintentional to me?
It's a good point. To my eyes, these are pretty wrong, and the fact that the map callback is re-evaluated on every iteration is incorrect in that it's needless and expensive. It's also surprising and I think it's a bit of a gift to maintainers to avoid being surprising... (BTW, the glob one is almost certainly my code. 😬)
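A small sketch of the re-evaluation cost being described, assuming a stored Iterable that gets scanned twice (callsAfterTwoPasses and its materialize flag are made-up names for illustration):

```dart
/// Counts how often the map callback runs when the stored Iterable is
/// scanned twice. (Hypothetical helper, for illustration only.)
int callsAfterTwoPasses({required bool materialize}) {
  var calls = 0;
  final source = ['a', 'bb', 'ccc'];
  final Iterable<int> lengths = source.map((s) {
    calls++;
    return s.length;
  });
  // Calling toList() runs the callback once per element and caches the results.
  final Iterable<int> stored = materialize ? lengths.toList() : lengths;

  // Two full passes; no element matches, so each pass visits everything.
  stored.any((n) => n > 5);
  stored.any((n) => n > 5);
  return calls;
}

void main() {
  print(callsAfterTwoPasses(materialize: false)); // 6
  print(callsAfterTwoPasses(materialize: true)); // 3
}
```

With a cheap callback the difference is negligible; with something like parsing a glob on every call it adds up on each access.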
As a general rule, it feels to me like you almost never want this behavior and if you do, an ignore is the way to go. Alternatively, something like a @lazy annotation might be interesting...
I think we would get a lot of false positives if it was triggered on this type of code.
And this is for sure a concern and why I'm doing some analysis w/ SDK sources to start (and g3 down the road).
@munificent: I'm curious if you have any gut reactions here (from a readability perspective)
(Needless to say, any and all feedback is very welcome!)
Incidentally, @eernstg opened https://github.com/dart-lang/sdk/issues/47221 proposing a LazyList which seems like a better way to handle the case where you really do want laziness when you expect zero or one accesses. Also, @leafpetersen points out that there's already a CachingIterable in the flutter foundation classes (see here).
Recently asked on reddit:
Why does .map() return an Iterable rather than a List?
(Cheers @Hixie and @jakemac53 for the useful responses... 👍)
FWIW: The FileGlobFilter cleanup this flagged landed in https://github.com/dart-lang/sdk/commit/a1bafa191c1d0f910b24eab59bd4bda75f9682d3
If we want to lint for returning iterables or storing them in a field I think that should be a separate lint - I don't think it meets the bar (in terms of false positives) for something we would be able to add to the recommended lint set, which I do think the general lint around using iterables should be added to. The lint for simply using them in some way should have essentially zero false positives.
If we want to lint for returning iterables or storing them in a field I think that should be a separate lint
+1
I don't think it meets the bar (in terms of false positives) for something we would be able to add to the recommended lint set
This is really useful and makes sense to me.
The lint for simply using them in some way should have essentially zero false positives.
I tend to agree and my initial analysis supports it.
If we run with this idea, it looks like maybe we have two lints:
use_iterables (or do_not_discard_iterables or similar)
do_not_store_iterables (or something)
Aside: both of these ideas correspond to existing annotations (@useResult and @doNotStore respectively) -- not that we can use them in the SDK (or should we for the storage case which could in fact have legitimate exceptions) but it is interesting to see these themes converge...
Much like unnecessary_statements.
I found myself wasting a bunch of time recently where I did things like:
and didn't think about how map is lazy and would do nothing in these cases. I thought I simply had empty lists in my tests! Linter could definitely catch this based on MethodElement and the fact that its return value is not used. Seems like a standard enough function that either unnecessary_statements could catch it, or it could be its own rule. Note that unnecessary_statements doesn't catch it now because it assumes all method calls have effects.