swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0

[SR-9822] String.UnicodeScalarsView.map(_:) crashes on Swift 5 #52239

Open norio-nomura opened 5 years ago

norio-nomura commented 5 years ago
Previous ID SR-9822
Radar None
Original Reporter @norio-nomura
Type Bug
Environment `swift-5.0-DEVELOPMENT-SNAPSHOT-2019-01-29-a`
Additional Detail from JIRA

| | |
|------------------|-----------------|
|Votes | 0 |
|Component/s | Standard Library |
|Labels | Bug, RunTimeCrash |
|Assignee | None |
|Priority | Medium |

md5: 65fada3d4d59ceab3c21324a17955aca

Issue Description:

Reproduced log:

$ pbpaste
// https://emojipedia.org/family-man-woman-girl-boy/
let family = "\u{1f468}\u{200d}\u{1f469}\u{200d}\u{1f467}\u{200d}\u{1f466}"
let utf8Index = family.utf8.index(before: family.utf8.endIndex)
let utf8Before = String(family[..<utf8Index])
_ = utf8Before.unicodeScalars.map { $0 }
$ pbpaste|xcrun --toolchain org.swift.5020190129a swift -
Fatal error: 
Stack dump:
0.  Program arguments: /Library/Developer/Toolchains/swift-5.0-DEVELOPMENT-SNAPSHOT-2019-01-29-a.xctoolchain/usr/bin/swift -frontend -interpret - -enable-objc-interop -sdk /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -color-diagnostics -module-name main 
0  swift                    0x000000010f537688 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 40
1  swift                    0x000000010f536905 llvm::sys::RunSignalHandlers() + 85
2  swift                    0x000000010f537c92 SignalHandler(int) + 258
3  libsystem_platform.dylib 0x00007fff5a830b5d _sigtramp + 29
4  libsystem_platform.dylib 0x0000000100000000 _sigtramp + 2776429760
5  libswiftCore.dylib       0x0000000112c5fd35 $sSKsE9_distance4from2toSi5IndexQz_AEtFSS17UnicodeScalarViewV_Tgq5Tf4nnx_n + 933
6  libswiftCore.dylib       0x0000000112bc122a $sSS17UnicodeScalarViewVSTsST19underestimatedCountSivgTW + 42
7  libswiftCore.dylib       0x0000000112ce2b19 $sSS17UnicodeScalarViewVSlsSl5countSivgTW + 9
8  libswiftCore.dylib       0x0000000112a35fb7 $sSlsE3mapySayqd__Gqd__7ElementQzKXEKlF + 295
9  libswiftCore.dylib       0x00000001130ee200 $sSlsE3mapySayqd__Gqd__7ElementQzKXEKlF + 7046000
10 swift                    0x000000010c0adadd llvm::MCJIT::runFunction(llvm::Function*, llvm::ArrayRef<llvm::GenericValue>) + 461
11 swift                    0x000000010c0b1491 llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, char const* const*) + 1313
12 swift                    0x000000010b97879b swift::RunImmediately(swift::CompilerInstance&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, swift::IRGenOptions&, swift::SILOptions const&) + 3579
13 swift                    0x000000010b952681 performCompile(swift::CompilerInstance&, swift::CompilerInvocation&, llvm::ArrayRef<char const*>, int&, swift::FrontendObserver*, swift::UnifiedStatsReporter*) + 13633
14 swift                    0x000000010b94e0ad swift::performFrontend(llvm::ArrayRef<char const*>, char const*, void*, swift::FrontendObserver*) + 3021
15 swift                    0x000000010b90004e main + 686
16 libdyld.dylib            0x00007fff5a64a3ed start + 1
17 libdyld.dylib            0x000000000000000a start + 2778422302
[1]    10370 done                          pbpaste | 
       10371 illegal hardware instruction  xcrun --toolchain org.swift.5020190129a swift -
norio-nomura commented 5 years ago

It does not crash on Swift 4.

norio-nomura commented 5 years ago

It crashes when a `String` is created using a range subscript with a `String.Index` that is not aligned to a `UnicodeScalar` boundary.
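A minimal sketch of a possible workaround, assuming the crash is avoided once the slicing index sits on a Unicode scalar boundary. `scalarAligned(_:in:)` is a hypothetical helper, not part of the standard library. It relies on `String.Index.samePosition(in:)`, which returns `nil` when the index has no exact position in the `unicodeScalars` view.

```swift
// Hypothetical helper, not part of the standard library: rounds `index` down
// to the nearest Unicode scalar boundary by stepping back one UTF-8 code unit
// at a time until the index has an exact position in the unicodeScalars view.
func scalarAligned(_ index: String.Index, in string: String) -> String.Index {
    let utf8 = string.utf8
    var i = index
    while i > utf8.startIndex, i.samePosition(in: string.unicodeScalars) == nil {
        i = utf8.index(before: i)
    }
    return i
}

let family = "\u{1f468}\u{200d}\u{1f469}\u{200d}\u{1f467}\u{200d}\u{1f466}"

// Same unaligned index as in the report: it points at the last UTF-8
// code unit, which is a continuation byte of the final scalar.
let utf8Index = family.utf8.index(before: family.utf8.endIndex)

// Align the index before slicing, then map over the scalars.
let aligned = scalarAligned(utf8Index, in: family)
let utf8Before = String(family[..<aligned])
let scalars = utf8Before.unicodeScalars.map { $0 }
```

Rounding down drops the partially covered final scalar instead of keeping its leading bytes. Whether that is the right behaviour depends on the caller.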

belkadan commented 5 years ago

I'm not sure this is supposed to be legal to begin with, but if it's not, there should be a proper error message. cc @milseman

bobergj commented 5 years ago

The String documentation suggests it should be legal, although it only mentions subscript, not range subscript:

https://github.com/apple/swift/blob/swift-5.0-branch/stdlib/public/core/String.swift

My bolding:

/// Note that an index into one view may not have an exact corresponding
/// position in another view. For example, the `flag` string declared above
/// comprises a single character, but is composed of eight code units when
/// encoded as UTF-8. The following code creates constants for the first and
/// second positions in the `flag.utf8` view. Accessing the `utf8` view with
/// these indices yields the first and second code UTF-8 units.
///
///     let firstCodeUnit = flag.startIndex
///     let secondCodeUnit = flag.utf8.index(after: firstCodeUnit)
///     // flag.utf8[firstCodeUnit] == 240
///     // flag.utf8[secondCodeUnit] == 159
///
/// When used to access the elements of the `flag` string itself, however, the
/// `secondCodeUnit` index does not correspond to the position of a specific
/// character. Instead of only accessing the specific UTF-8 code unit, that
/// index is treated as the position of the character at the index's encoded
/// offset. In the case of `secondCodeUnit`, that character is still the flag
/// itself.
///
///     // flag[firstCodeUnit] == "\<flag>"
///     // flag[secondCodeUnit] == "\<flag>"

If we compare the behaviour of Swift 4.2 and Swift 5 surrounding this example:

 let flag = "<flag>"
 let firstCodeUnit = flag.startIndex
 let secondCodeUnit = flag.utf8.index(after: firstCodeUnit)
 flag[secondCodeUnit] // Swift 4.2: "<flag>", Swift 5: "<flag>"
 let flag1 = String(flag[secondCodeUnit..<flag.endIndex]) // Swift 4.2: "<flag>", Swift 5: "<prints some octals>"

Note: I replaced the actual flag character from the documentation with \<flag> because it broke saving in JIRA.

I am not sure it makes sense that the subscript and range subscript behave differently in this regard.
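To make the comparison easier to reproduce, here is a small sketch that checks whether each index from the documentation example lies on a Unicode scalar boundary. It assumes the flag is written as the two regional indicator scalars U+1F1F5 and U+1F1F7, standing in for the literal flag character that could not be pasted here. The names `firstIsAligned` and `secondIsAligned` are illustrative only.

```swift
// Regional indicator symbols P and R, which together form a single
// flag character (an assumption standing in for the documentation's flag).
let flag = "\u{1F1F5}\u{1F1F7}"
let firstCodeUnit = flag.startIndex
let secondCodeUnit = flag.utf8.index(after: firstCodeUnit)

// Both indices are valid positions in the UTF-8 view.
let firstByte = flag.utf8[firstCodeUnit]
let secondByte = flag.utf8[secondCodeUnit]

// samePosition(in:) returns nil when the index has no exact
// corresponding position in the unicodeScalars view.
let firstIsAligned = firstCodeUnit.samePosition(in: flag.unicodeScalars) != nil
let secondIsAligned = secondCodeUnit.samePosition(in: flag.unicodeScalars) != nil

print(flag.utf8.count, firstByte, secondByte, firstIsAligned, secondIsAligned)
```

A check like this lets calling code decide what to do with an unaligned index before it reaches a range subscript, rather than depending on how a particular Swift version handles it.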