dart-lang / sdk


UTF-16 file causes parser to crash #54523

Open Hixie opened 10 months ago

Hixie commented 10 months ago

If you encode the following file as UTF-16 (little-endian with a byte order mark, as in the hex dumps below) and try to run it with dart test.dart, the Dart parser crashes:

// This file will cause the Dart parser to crash.
class A {
  int b() {
     return testA.end - testA.start - testB.end - testB.start;
  }
}

A nearly identical file, encoded the same way, instead makes the Dart parser report that it cannot decode the bytes as UTF-8:

// This file cannot be decoded for some reason...
class A {
  int b() {
     return testA.end - testA.start - testB.end - testB.start;
  }
}
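
For reproduction, the two files can be generated rather than saved by hand. The sketch below uses Python purely for convenience and assumes what the hex dumps in this report show: UTF-16 little-endian, a leading FF FE byte order mark, and LF line endings. The helper names are hypothetical; the file names test1.dart and test2.dart are taken from the crash and error output.

```python
from pathlib import Path

# Source text shared by both files; only the leading comment differs.
BODY = (
    "class A {\n"
    "  int b() {\n"
    "     return testA.end - testA.start - testB.end - testB.start;\n"
    "  }\n"
    "}\n"
)
TEST1_SOURCE = "// This file will cause the Dart parser to crash.\n" + BODY
TEST2_SOURCE = "// This file cannot be decoded for some reason...\n" + BODY


def encode_utf16le_with_bom(text):
    """Encode text as UTF-16 LE, prefixed with the FF FE byte order mark."""
    return b"\xff\xfe" + text.encode("utf-16-le")


def write_repro_files(directory):
    """Write test1.dart and test2.dart into the given directory."""
    target = Path(directory)
    (target / "test1.dart").write_bytes(encode_utf16le_with_bom(TEST1_SOURCE))
    (target / "test2.dart").write_bytes(encode_utf16le_with_bom(TEST2_SOURCE))
```

Calling write_repro_files(".") and then running dart test1.dart and dart test2.dart should exercise the two behaviours described here, assuming an SDK version affected by this issue. Both encoded files come out at 0x11c bytes, matching the dumps.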

The crash is:

Crash when compiling file:///.../test1.dart at character offset 118:
type 'AmbiguousBuilder' is not a subtype of type 'Expression' in type cast

#0      BodyBuilder.handleUnaryPrefixExpression (package:front_end/src/fasta/kernel/body_builder.dart:5785:60)
#1      Parser.parseUnaryExpression (package:_fe_analyzer_shared/src/parser/parser_impl.dart:6365:16)
#2      Parser.parsePrecedenceExpression (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5843:13)
#3      Parser.parseExpression (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5783:15)
#4      Parser.parseExpressionStatement (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5744:13)
#5      Parser.parseExpressionStatementOrDeclarationAfterModifiers (package:_fe_analyzer_shared/src/parser/parser_impl.dart:8115:16)
#6      Parser.parseExpressionStatementOrDeclaration (package:_fe_analyzer_shared/src/parser/parser_impl.dart:7960:12)
#7      Parser.parseStatementX (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5640:14)
#8      Parser.parseStatement (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5535:20)
#9      Parser.parseFunctionBody (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5440:15)
#10     Parser.parseAsyncOptBody (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5265:13)
#11     Parser.parseNamedFunctionRest (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5244:13)
#12     Parser.parseExpressionStatementOrDeclarationAfterModifiers (package:_fe_analyzer_shared/src/parser/parser_impl.dart:8051:16)
#13     Parser.parseStatementX (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5545:14)
#14     Parser.parseStatement (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5535:20)
#15     Parser.parseFunctionBody (package:_fe_analyzer_shared/src/parser/parser_impl.dart:5440:15)
#16     DietListener.buildFunctionBody (package:front_end/src/fasta/source/diet_listener.dart:1240:14)
#17     DietListener.endTopLevelMethod (package:front_end/src/fasta/source/diet_listener.dart:364:5)
#18     Parser.parseTopLevelMethod (package:_fe_analyzer_shared/src/parser/parser_impl.dart:3856:14)
#19     Parser.parseTopLevelMemberImpl (package:_fe_analyzer_shared/src/parser/parser_impl.dart:3596:14)
#20     Parser.parseTopLevelDeclarationImpl (package:_fe_analyzer_shared/src/parser/parser_impl.dart:616:14)
#21     Parser.parseUnit (package:_fe_analyzer_shared/src/parser/parser_impl.dart:411:15)
#22     SourceLoader.buildBody (package:front_end/src/fasta/source/source_loader.dart:1244:12)
<asynchronous suspension>
#23     SourceLoader.buildBodies (package:front_end/src/fasta/source/source_loader.dart:666:7)
<asynchronous suspension>
#24     KernelTarget.buildComponent.<anonymous closure> (package:front_end/src/fasta/kernel/kernel_target.dart:598:7)
<asynchronous suspension>
#25     withCrashReporting (package:front_end/src/fasta/crash.dart:133:12)
<asynchronous suspension>
#26     KernelTarget.buildComponent (package:front_end/src/fasta/kernel/kernel_target.dart:579:12)
<asynchronous suspension>
#27     _buildInternal (package:front_end/src/kernel_generator_impl.dart:210:19)
<asynchronous suspension>
#28     withCrashReporting (package:front_end/src/fasta/crash.dart:133:12)
<asynchronous suspension>
#29     generateKernel.<anonymous closure> (package:front_end/src/kernel_generator_impl.dart:49:12)
<asynchronous suspension>
#30     CompilerContext.clear (package:front_end/src/fasta/compiler_context.dart:139:3)
<asynchronous suspension>
#31     generateKernel (package:front_end/src/kernel_generator_impl.dart:48:10)
<asynchronous suspension>
#32     kernelForModule (package:front_end/src/api_prototype/kernel_generator.dart:100:11)
<asynchronous suspension>
#33     SingleShotCompilerWrapper.compileInternal (file:///b/s/w/ir/x/w/sdk/pkg/vm/bin/kernel_service.dart:419:11)
<asynchronous suspension>
#34     Compiler.compile.<anonymous closure> (file:///b/s/w/ir/x/w/sdk/pkg/vm/bin/kernel_service.dart:225:45)
<asynchronous suspension>
#35     _processLoadRequest (file:///b/s/w/ir/x/w/sdk/pkg/vm/bin/kernel_service.dart:906:37)
<asynchronous suspension>

The error message for the second file makes sense if we only support UTF-8:

test2.dart:1:1: Error: Unable to decode bytes as UTF-8.
��/
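
The UTF-8 rejection itself is expected: the bytes FF and FE never occur in well-formed UTF-8, so a strict decoder fails on the very first byte of either file. The sketch below illustrates this with Python's built-in codecs (not the Dart front end's decoder), using the first eight bytes from the hex dumps. A lossy decode yields two replacement characters followed by "/", which is consistent with the ��/ shown in the message. It does not explain why test1.dart gets past this check.

```python
# First eight bytes shared by both files, copied from the hex dumps.
HEAD = bytes.fromhex("ff fe 2f 00 2f 00 20 00")


def try_utf8(data):
    """Return (text, None) on success, or (None, error) if data is not valid UTF-8."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as error:
        return None, error


text, error = try_utf8(HEAD)
print(error)  # reports an invalid start byte at position 0

# Lossy decoding replaces each invalid byte with U+FFFD.
print(repr(HEAD.decode("utf-8", errors="replace")))

# Decoding as UTF-16 consumes the BOM and recovers the start of the comment.
print(repr(HEAD.decode("utf-16")))
```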

The files are identical other than the contents of the comment:

test1.dart
00000000  ff fe 2f 00 2f 00 20 00  54 00 68 00 69 00 73 00  |.././. .T.h.i.s.|
00000010  20 00 66 00 69 00 6c 00  65 00 20 00 77 00 69 00  | .f.i.l.e. .w.i.|
00000020  6c 00 6c 00 20 00 63 00  61 00 75 00 73 00 65 00  |l.l. .c.a.u.s.e.|
00000030  20 00 74 00 68 00 65 00  20 00 44 00 61 00 72 00  | .t.h.e. .D.a.r.|
00000040  74 00 20 00 70 00 61 00  72 00 73 00 65 00 72 00  |t. .p.a.r.s.e.r.|
00000050  20 00 74 00 6f 00 20 00  63 00 72 00 61 00 73 00  | .t.o. .c.r.a.s.|
00000060  68 00 2e 00 0a 00 63 00  6c 00 61 00 73 00 73 00  |h.....c.l.a.s.s.|
00000070  20 00 41 00 20 00 7b 00  0a 00 20 00 20 00 69 00  | .A. .{... . .i.|
00000080  6e 00 74 00 20 00 62 00  28 00 29 00 20 00 7b 00  |n.t. .b.(.). .{.|
00000090  0a 00 20 00 20 00 20 00  20 00 20 00 72 00 65 00  |.. . . . . .r.e.|
000000a0  74 00 75 00 72 00 6e 00  20 00 74 00 65 00 73 00  |t.u.r.n. .t.e.s.|
000000b0  74 00 41 00 2e 00 65 00  6e 00 64 00 20 00 2d 00  |t.A...e.n.d. .-.|
000000c0  20 00 74 00 65 00 73 00  74 00 41 00 2e 00 73 00  | .t.e.s.t.A...s.|
000000d0  74 00 61 00 72 00 74 00  20 00 2d 00 20 00 74 00  |t.a.r.t. .-. .t.|
000000e0  65 00 73 00 74 00 42 00  2e 00 65 00 6e 00 64 00  |e.s.t.B...e.n.d.|
000000f0  20 00 2d 00 20 00 74 00  65 00 73 00 74 00 42 00  | .-. .t.e.s.t.B.|
00000100  2e 00 73 00 74 00 61 00  72 00 74 00 3b 00 0a 00  |..s.t.a.r.t.;...|
00000110  20 00 20 00 7d 00 0a 00  7d 00 0a 00              | . .}...}...|
0000011c

test2.dart
00000000  ff fe 2f 00 2f 00 20 00  54 00 68 00 69 00 73 00  |.././. .T.h.i.s.|
00000010  20 00 66 00 69 00 6c 00  65 00 20 00 63 00 61 00  | .f.i.l.e. .c.a.|
00000020  6e 00 6e 00 6f 00 74 00  20 00 62 00 65 00 20 00  |n.n.o.t. .b.e. .|
00000030  64 00 65 00 63 00 6f 00  64 00 65 00 64 00 20 00  |d.e.c.o.d.e.d. .|
00000040  66 00 6f 00 72 00 20 00  73 00 6f 00 6d 00 65 00  |f.o.r. .s.o.m.e.|
00000050  20 00 72 00 65 00 61 00  73 00 6f 00 6e 00 2e 00  | .r.e.a.s.o.n...|
00000060  2e 00 2e 00 0a 00 63 00  6c 00 61 00 73 00 73 00  |......c.l.a.s.s.|
00000070  20 00 41 00 20 00 7b 00  0a 00 20 00 20 00 69 00  | .A. .{... . .i.|
00000080  6e 00 74 00 20 00 62 00  28 00 29 00 20 00 7b 00  |n.t. .b.(.). .{.|
00000090  0a 00 20 00 20 00 20 00  20 00 20 00 72 00 65 00  |.. . . . . .r.e.|
000000a0  74 00 75 00 72 00 6e 00  20 00 74 00 65 00 73 00  |t.u.r.n. .t.e.s.|
000000b0  74 00 41 00 2e 00 65 00  6e 00 64 00 20 00 2d 00  |t.A...e.n.d. .-.|
000000c0  20 00 74 00 65 00 73 00  74 00 41 00 2e 00 73 00  | .t.e.s.t.A...s.|
000000d0  74 00 61 00 72 00 74 00  20 00 2d 00 20 00 74 00  |t.a.r.t. .-. .t.|
000000e0  65 00 73 00 74 00 42 00  2e 00 65 00 6e 00 64 00  |e.s.t.B...e.n.d.|
000000f0  20 00 2d 00 20 00 74 00  65 00 73 00 74 00 42 00  | .-. .t.e.s.t.B.|
00000100  2e 00 73 00 74 00 61 00  72 00 74 00 3b 00 0a 00  |..s.t.a.r.t.;...|
00000110  20 00 20 00 7d 00 0a 00  7d 00 0a 00              | . .}...}...|
0000011c
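
To check that the dumps differ only inside the comment, the rows that are not byte-for-byte identical (offsets 0x10 through 0x60) can be decoded back to text. This is a minimal sketch that assumes the offset, hex bytes, |ascii| row layout used above; bytes_from_hexdump is a hypothetical helper, and the rows are copied verbatim from the two dumps.

```python
def bytes_from_hexdump(rows):
    """Extract raw bytes from rows laid out as: offset, hex bytes, |ascii|."""
    data = bytearray()
    for row in rows:
        hex_part = row.partition("|")[0]   # drop the ASCII column
        tokens = hex_part.split()[1:]      # drop the leading offset
        data.extend(int(token, 16) for token in tokens)
    return bytes(data)


TEST1_ROWS = [
    "00000010  20 00 66 00 69 00 6c 00  65 00 20 00 77 00 69 00  | .f.i.l.e. .w.i.|",
    "00000020  6c 00 6c 00 20 00 63 00  61 00 75 00 73 00 65 00  |l.l. .c.a.u.s.e.|",
    "00000030  20 00 74 00 68 00 65 00  20 00 44 00 61 00 72 00  | .t.h.e. .D.a.r.|",
    "00000040  74 00 20 00 70 00 61 00  72 00 73 00 65 00 72 00  |t. .p.a.r.s.e.r.|",
    "00000050  20 00 74 00 6f 00 20 00  63 00 72 00 61 00 73 00  | .t.o. .c.r.a.s.|",
    "00000060  68 00 2e 00 0a 00 63 00  6c 00 61 00 73 00 73 00  |h.....c.l.a.s.s.|",
]
TEST2_ROWS = [
    "00000010  20 00 66 00 69 00 6c 00  65 00 20 00 63 00 61 00  | .f.i.l.e. .c.a.|",
    "00000020  6e 00 6e 00 6f 00 74 00  20 00 62 00 65 00 20 00  |n.n.o.t. .b.e. .|",
    "00000030  64 00 65 00 63 00 6f 00  64 00 65 00 64 00 20 00  |d.e.c.o.d.e.d. .|",
    "00000040  66 00 6f 00 72 00 20 00  73 00 6f 00 6d 00 65 00  |f.o.r. .s.o.m.e.|",
    "00000050  20 00 72 00 65 00 61 00  73 00 6f 00 6e 00 2e 00  | .r.e.a.s.o.n...|",
    "00000060  2e 00 2e 00 0a 00 63 00  6c 00 61 00 73 00 73 00  |......c.l.a.s.s.|",
]

data1 = bytes_from_hexdump(TEST1_ROWS)
data2 = bytes_from_hexdump(TEST2_ROWS)

# Offset 0x10 is even, so the rows start on a UTF-16 code unit boundary.
text1 = data1.decode("utf-16-le")
text2 = data2.decode("utf-16-le")
print(repr(text1))
print(repr(text2))

# File offset of the first differing byte (the rows start at 0x10).
first_diff = 0x10 + next(i for i, (a, b) in enumerate(zip(data1, data2)) if a != b)
print(hex(first_diff))
```

Both excerpts begin with " file " and end with a newline followed by "class", so the differing bytes fall entirely within the first-line comment.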

parlough commented 10 months ago

Thanks for the details!

Triaging under front-end and cfe-crashes, because a more user-friendly, explanatory error message would be helpful here.

As a side note, while the spec doesn't specify an encoding for the Dart language (https://github.com/dart-lang/language/issues/2186), perhaps we should document on the website which encoding the SDK and its tooling expect?
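
As an illustration of what a more explanatory diagnostic could be based on, a tool can inspect the leading bytes for a byte order mark before attempting to decode. The sketch below is hypothetical and is not how the Dart front end works: the function names and the message wording are invented for this example, and only the byte order mark values are standard.

```python
# Byte order marks, longest first, so that UTF-32 LE (FF FE 00 00) is
# tested before UTF-16 LE (FF FE). A UTF-16 LE file whose first character
# is U+0000 would be misreported as UTF-32 LE; this sketch ignores that case.
BOMS = [
    (b"\xff\xfe\x00\x00", "UTF-32 LE"),
    (b"\x00\x00\xfe\xff", "UTF-32 BE"),
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xff\xfe", "UTF-16 LE"),
    (b"\xfe\xff", "UTF-16 BE"),
]


def sniff_bom(data):
    """Return the encoding named by a leading byte order mark, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def explain_encoding(path, data):
    """Return a hypothetical diagnostic for non-UTF-8 input, or None."""
    encoding = sniff_bom(data)
    if encoding is None or encoding == "UTF-8":
        return None
    return (
        f"{path}: the file starts with a {encoding} byte order mark; "
        "re-save it as UTF-8."
    )


print(explain_encoding("test1.dart", bytes.fromhex("fffe2f002f00")))
```

A check like this only catches files that carry a byte order mark; UTF-16 without one would still need a different heuristic.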