swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0
67.54k stars 10.35k forks source link

Substring violates Hashable’s requirements #59388

Open jepers opened 2 years ago

jepers commented 2 years ago

Describe the bug

Dictionaries and Sets of Substrings containing non-ascii characters can sometimes end up having duplicated keys and/or crash.

Swift Forum post

To Reproduce

  1. Create a file test.swift containing the following program:
    
    func randomString() -> String {
    (0..<10_000).map({ _ in Bool.random() ? "å" : "ä" }).joined()
    }

extension String { func randomSubstring() -> Substring { let i = indices.dropLast(1).randomElement()! return self[i...i] } }

func test() { let s1 = randomString() let s2 = randomString() var dict = [Substring : Int]() for _ in 0 ..< 1000 { let a = s1.randomSubstring() let b = s2.randomSubstring() dict[a, default: 0] += 1 dict[b, default: 0] += 1 } dict.forEach { print($0, $1) } } test()


2. Compile and run the program a couple of times:

% swiftc --version swift-driver version: 1.45.2 Apple Swift version 5.6.1 (swiftlang-5.6.0.323.66 clang-1316.0.20.12) Target: x86_64-apple-macosx12.0 % swiftc -O test.swift && ./test ä 492 å 1002 ä 506 % swiftc -O test.swift && ./test ä 970 å 1030 % swiftc -O test.swift && ./test å 498 ä 479 å 520 ä 503 % swiftc -O test.swift && ./test Fatal error: Duplicate keys of type 'Substring' were found in a Dictionary. This usually means either that the type violates Hashable's requirements, or that members of such a dictionary were mutated after insertion.


3. Note that if the "å" and "ö" Characters are changed to eg "x" and "y", then it will behave as expected (never print duplicate dictionary keys and never crash):

% swiftc -O test.swift && ./test y 1004 x 996 % swiftc -O test.swift && ./test y 1016 x 984 % swiftc -O test.swift && ./test x 993 y 1007

lorentey commented 2 years ago

https://forums.swift.org/t/substring-violates-hashables-requirements-stdlib-compiler-bug/58046/5

Thanks for raising this issue!

Yes, this is unfortunately indeed broken in the Swift 5.6 stdlib -- the fix is present on the Swift 5.7 release branch.

(Note: This fix is not included in the beta OS releases that Apple released earlier this week.)

Currently, the least painful workaround is to refrain from using Substring as a Set/Dictionary key for code that needs to run on a 5.6 stdlib. (I know that's still quite painful. :disappointed:)

https://forums.swift.org/t/substring-violates-hashables-requirements-stdlib-compiler-bug/58046/7

We're thinking about ways to make the fix deploy back to earlier releases.

I'm going to use this issue to track the potential back deployment of the fix. Beyond really bad @_silgen_name tricks, we can also try using @_backDeploy for force-emitting the implementation into clients that deploy to 5.6, while also preserving the ABI entry point.

However, we'll need to figure out if we can reasonably implement this with APIs that are available to call from inlinable functions.

lorentey commented 2 years ago

The actual fix has landed with #58948/#58968, and is (currently) on track to ship in Swift 5.7.

jepers commented 2 years ago

This is still unresolved in 5.7:

% swiftc --version
swift-driver version: 1.62.8 Apple Swift version 5.7 (swiftlang-5.7.0.127.4 clang-1400.0.29.50)
Target: x86_64-apple-macosx12.0

% swiftc -O test.swift && ./test
Fatal error: Duplicate keys of type 'Substring' were found in a Dictionary.
This usually means either that the type violates Hashable's requirements, or
that members of such a dictionary were mutated after insertion.
zsh: illegal hardware instruction  ./test
lorentey commented 2 years ago

@jepers, did you see this on macOS Ventura or Monterey? The fix requires a new stdlib, and that only ships within the OS. (Upgrading the toolchain isn't enough, unfortunately.) I'll double check myself later, too.