dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.09k stars 1.56k forks source link

Iterator.current throws null errors instead of StateError #45650

Open nex3 opened 3 years ago

nex3 commented 3 years ago

If you call Iterator.current on an non-nullable iterator that's not in a valid location, it will throw a null-safety error rather than the StateError I would expect. This leads to a confusing debugging experience, since as a user I expect the Dart core libraries to be fully null-safe.

For example:

void main() {
  var iterator = <int>[].iterator..moveNext();
  print(iterator.current);
}
lrhn commented 3 years ago

The iterator.current implementation deliberately tries to not add extra overhead for when the type variable is nullable.

That's why it does something as simple as _current ?? null as dynamic instead of trying to deduce whether T is nullable and then throw a special error if it's not. The as dynamic is there instead of as T to allow dart2js to omit the downcast. That's one thing that we can't do with an explicit check for is T in order to throw a StateError. (Probably not the actual code, because the compilers would then apply the downcast to the entire expression, even when _current is non-null).

That's not saying we couldn't do something different (we might risk changing the behavior of dart2js's unsound optimization modes, and probably increase their overhead).

The smallest thing which can throw a StateError would be something like:

T get current {
  var current = _current;
  if (current != null) return current;
  if (current is T) return current;   // Changes `as dynamic` + implicit downcast to `is` check
  throw StateError("No value in iterator");
}

It's not bad, approximately the same number of tests, but less unsafely optimizable for dart2js.

nex3 commented 3 years ago

Doesn't dart2js have an optimization flag that allows it to omit all code paths that are guaranteed to throw subtypes of Error?

rakudrama commented 3 years ago

@nex3 No, dart2js at -O3 omits covariant and dynamic arguments type checks, and at -O4 omits null operand checks on arithmetic (very Dart 1 - they are impossible with sound null safety) and array bounds checks. as checks are currently never omitted, but can be omitted with a @pragma. There has never been an option to omit a throw expression written by the developer.

nex3 commented 3 years ago

The --help documentation for -O3 say that it "Enables optimizations that respect the language semantics only on programs that don't ever throw any subtype of Error." Wouldn't that give you the freedom to optimize out user-authored (or in this case, core-library-authored) code paths that throw errors? I assumed that that's what it did do, and I'm disappointed to hear that it doesn't—it's valuable for authors at every level of the stack to be able to write code that reliably throws useful diagnostic errors in development, but doesn't add code size or runtime overhead at runtime.

sigmundch commented 3 years ago

To clarify, the language in the command line help was written that way to give users a simple guidance without being overly specific. We didn't want to list every kind of error that is currently optimized away, since the list is likely to change over time.

The guidance goal is to help users recognize what ensures that the semantics will not change at this level of optimization (e.g. if you have solid test coverage showing that you never see Error, you can be assured that -O3 will be safe.), but not to promise what the compiler may do with it.

It sounds to me, though, that you want a mechanism to ensure code is always tree-shaken and used only for development. We need to provide better guidance on that. We have also discussed adding mechanisms to annotate code with markers like @debugOnly that can later be leveraged to statically check/warn if code is used in a way that is not guaranteed to be tree-shaken.

rakudrama commented 3 years ago

assert is the supported mechanism for excluding code in production. It does not have to throw an AssertionError, you can jump in front of that by throwing in the 'message' part:

assert(0 <= n && n < this.length, throw RangeError.index(n, this));
nex3 commented 3 years ago

But assert doesn't interact well with the type system, especially in a world of null checks. For example:

int assertNonNull(int? value) {
  assert(value != null, throw ArgumentError("$value must not be null"));
  return value;
}

produces a static error on return value. (That said, @lrhn, an assertion that throws a StateError would certainly be preferable to the status quo.)


I have two broad points with respect to the dart2js message:

  1. The current --help message is misleading. Whatever the original intent, it's very easy to read it as saying "you can throw Errors and have them optimized out in -O3 mode."

  2. The behavior that -O3 seems to describe is highly desirable. Wrapping every check in assert()s is not only awkward and at odds with the type system, it's not the common design pattern when writing Dart code. Existing code by and large throws Error subclasses, especially at library boundaries where intelligible error messages are likely to be important during development.

    The other side of that coin, though, is that if you did implement the semantics I'm describing (maybe as part of the more-aggressive -O4 level), you would automatically get a considerable code size reduction for free across every app. What's more, you'd do so while still working within the guarantees that the optimization levels have always provided: as long as app code is thoroughly tested and never throws an Error at runtime, it will continue to work great in production. If it would help, I can file a separate feature request for this.

songyang-dev commented 4 months ago

I propose making moveNext() and .current return E?. .current will just hold the last execution value of moveNext(), which will be null if moveNext() was never called or has gone passed the end of the iteration.

What do you think?

lrhn commented 4 months ago

We deliberately made current return E instead of E? when we introduced null safety, because the alternative is that each step of the iteration must do an extra null check, exactly the steps where we actually know that the value won't be null. So returning E? is exactly what we didn't want.

That choice was also why we could no longer return null in the out-of-bounds accesses to current, but you were never supposed to call current in those positions, because you should know whether the last call to moveNext() returned true (or was ever called). The official answer is still: Don't read current unless moveNext has just returned true. if you do, the result is unspecified in the general case (it's an Iterable), and each specific iterable may choose to return a value or not.

songyang-dev commented 4 months ago

Ok thanks for the clarification. I think the error would be more helpful if it threw an out of bounds exception instead of a type mismatch.