dart-lang / site-www

Source for Dart website
https://dart.dev
Other
968 stars 700 forks source link

Fix definition of Iterable on 'Iterable collections' page #5618

Open dtonhofer opened 8 months ago

dtonhofer commented 8 months ago

Page URL

https://dart.dev/codelabs/iterables/

Page source

https://github.com/dart-lang/site-www/tree/main/./src/content/codelabs/iterables.md

Describe the problem

We read in What is an Iterable?

You can instead read elements with elementAt(), which steps through the elements of the iterable until it reaches that position.

That would depend on the implementation of the interface.

Expected fix

Suggesting to add "...if the underlying data structure does not allow for a more efficient method", as would be expected.

Because clearly, using elementAt() on a List does not result in linear traversal, but proper indexed access.

The description for elementAt says:

May iterate through the elements in iteration order, ignoring the first index elements and then returning the next. Some iterables may have a more efficient way to find the element.

We can also run a quick test of efficiency regarding accessing a List with elementAt() vs accessing a Set with elementAt():

import 'dart:math';

final random = Random();

void main() {
  const int size = 10000000;
  const int retrievals = 10000;
  final List<int> largeList = List.unmodifiable(buildLargeList(size));
  final Set<int> largeSet = Set.unmodifiable(largeList);
  // Do not penalize the first measurement, do a "hidden access" first
  accessingListWithIndexing(largeList, retrievals);
  // Now measure
  measureTime(
      'accessingListWithIndexing() with size = $size, retrievals = $retrievals',
      () => {accessingListWithIndexing(largeList, retrievals)});
  measureTime(
      'accessingListWithElementAt() with size = $size, retrievals = $retrievals',
      () => {accessingListWithElementAt(largeList, retrievals)});
  measureTime(
      'accessingIterable(), with iterable a List, with size = $size, retrievals = $retrievals',
      () => {accessingIterableWithElementAt(largeList, retrievals, size)});
  measureTime(
      'accessingIterable(), with iterable a Set, with size = $size, retrievals = $retrievals',
      () => {accessingIterableWithElementAt(largeSet, retrievals, size)});
}

void measureTime(String desc, Function func) {
  Stopwatch stopwatch = Stopwatch()..start();
  func();
  print('$desc took ${stopwatch.elapsed}');
}

List<int> buildLargeList(int size) {
  assert(size >= 0);
  final List<int> largeList = [];
  for (int i = 0; i < size; i++) {
    largeList.add(i);
  }
  // Why shuffle the list? It doesn't change anything but
  // let's do it anyway.
  largeList.shuffle(random);
  return largeList;
}

void accessingListWithIndexing(List<int> largeList, final int retrievals) {
  final int size = largeList.length;
  for (int retrievalsToGo = retrievals; retrievalsToGo > 0; retrievalsToGo--) {
    largeList[random.nextInt(size)];
  }
}

void accessingListWithElementAt(List<int> largeList, final int retrievals) {
  final int size = largeList.length;
  for (int retrievalsToGo = retrievals; retrievalsToGo > 0; retrievalsToGo--) {
    largeList.elementAt(random.nextInt(size));
  }
}

void accessingIterableWithElementAt(
    final Iterable<int> largeList, final int retrievals, final int size) {
  for (int retrievalsToGo = retrievals; retrievalsToGo > 0; retrievalsToGo--) {
    largeList.elementAt(random.nextInt(size));
  }
}

Result:

accessingListWithIndexing() with size = 10000000, retrievals = 10000 took 0:00:00.001236
accessingListWithElementAt() with size = 10000000, retrievals = 10000 took 0:00:00.000784
accessingIterable(), with iterable a List, with size = 10000000, retrievals = 10000 took 0:00:00.000763
accessingIterable(), with iterable a Set, with size = 10000000, retrievals = 10000 took 0:01:24.974973

Additional context

No response

I would like to fix this problem.

atsansone commented 6 months ago

@eernstg : Could use your input here. Thanks!

eernstg commented 6 months ago

The reader basically just needs to know that elementAt can be expensive, and if we want better performance it might help to do .toList() and use operator [].

In that situation it won't help much to know that "elementAt might be fast", it really only helps if it makes the reader think "do .toList() and use operator []" ("do .toList() and use elementAt" incurs an extra call so it does actually make sense to say "use operator []").

So perhaps the text should just say that? Something like "Note that elementAt can have complexity which is linear in the size of the iterable (which is very expensive in some cases). If you plan to do it repeatedly then consider doing .toList() on the iterable and use operator []."