swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0

[SR-15713] Global initialization is non-lazy, contrary to documentation and allowing access to uninitialized values #57991

Open mhjacobson opened 2 years ago

mhjacobson commented 2 years ago
Previous ID SR-15713
Radar rdar://84128006
Original Reporter @mhjacobson
Type Bug


Environment

macOS v10.15 Catalina (19H1417)
MacBookAir6,2
Xcode 12.4 (12D4e)

```
$ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc -v
Apple Swift version 5.3.2 (swiftlang-1200.0.45 clang-1200.0.32.28)
Target: x86_64-apple-darwin19.6.0
```

Additional Detail from JIRA

| | |
|------------------|-----------------|
|Votes | 0 |
|Component/s | Compiler |
|Labels | Bug |
|Assignee | @etcwilde |
|Priority | Medium |

md5: 5f959edc50f7b7263045016bb88930f3

Issue Description:

The TSPL book says:

> Global constants and variables are always computed lazily, in a similar manner to Lazy Stored Properties. Unlike lazy stored properties, global constants and variables don’t need to be marked with the lazy modifier.

However, at least in some cases, that appears not to be true. Take this simple example. Assume `processAgeInSeconds()` returns the number of seconds since the process was spawned (which on Darwin can be gleaned through the `libproc` APIs):

```swift
// global scope

let ageString = String(processAgeInSeconds())
sleepThenLog()

func sleepThenLog() {
    sleep(10)
    print("ageString initialized \(ageString) seconds after launch")
}
```

Since the value placed into `ageString` (almost always) ends up being less than ten, it's clear that `ageString` is being initialized prior to its first (and only) use inside `sleepThenLog()`. (This can be confirmed more strictly with breakpoints, but I figured this was more accessible.)

Things get even stranger if you place the call to `sleepThenLog()` before the definition of `ageString`, like this:

```swift
sleepThenLog()
let ageString = String(processAgeInSeconds())
```

In this case, not only does the use of `ageString` inside `sleepThenLog()` not initialize `ageString`; the resultant use of an apparently uninitialized `String` then crashes! This makes sense given what I know, but it also seems very unlike Swift to allow access to an uninitialized value.

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x18)
    frame #0: 0x00007fff69a299e8 libswiftCore.dylib`Swift._StringObject.getSharedUTF8Start() -> Swift.UnsafePointer<Swift.UInt8> + 24
    frame #1: 0x00007fff69a29a0e libswiftCore.dylib`Swift._StringObject.sharedUTF8.getter : Swift.UnsafeBufferPointer<Swift.UInt8> + 14
    frame #2: 0x00007fff69a22c7a libswiftCore.dylib`Swift._StringGuts.append(Swift._StringGutsSlice) -> () + 970
    frame #3: 0x00007fff699bc1e6 libswiftCore.dylib`Swift._StringGuts.append(Swift._StringGuts) -> () + 166
    frame #4: 0x00007fff699bbf9a libswiftCore.dylib`Swift.String.write<A where A: Swift.TextOutputStream>(to: inout A) -> () + 26
    frame #5: 0x0000000100002add TestUninitialized`sleepThenLog() [inlined] inlined generic function <Swift.DefaultStringInterpolation> of protocol witness for Swift.TextOutputStreamable.write<A where A1: Swift.TextOutputStream>(to: inout A1) -> () in conformance Swift.String : Swift.TextOutputStreamable in Swift at <compiler-generated>:0 [opt]
    frame #6: 0x0000000100002ac8 TestUninitialized`sleepThenLog() [inlined] generic specialization <Swift.String> of Swift.DefaultStringInterpolation.appendInterpolation<A where A: Swift.CustomStringConvertible, A: Swift.TextOutputStreamable>(A) -> () at <compiler-generated>:0 [opt]
  * frame #7: 0x0000000100002ac8 TestUninitialized`sleepThenLog() at main.swift:13 [opt]
    frame #8: 0x00000001000029bc TestUninitialized`main at main.swift:8:1 [opt]
    frame #9: 0x00007fff6a28acc9 libdyld.dylib`start + 1
```

It's possible these are not considered compiler bugs, but if so, then at least it seems like the book should be corrected.

mhjacobson commented 2 years ago

Oh, looks like things behave differently if I don't use `main.swift` and avoid expressions-at-top-level. That's subtle! Is there some documentation on how global variables interact with expressions-at-top-level?
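For comparison, here is a minimal sketch of what lazy initialization looks like when it does happen. It relies on stored type properties, which TSPL also documents as lazily initialized on first access, so the ordering can be observed in a single file even with expressions at top level. The names `makeValue`, `Globals`, and `useValue` are hypothetical and exist only for illustration.

```swift
// Hypothetical stand-in for an expensive initializer; prints so the
// moment of initialization is visible in the output.
func makeValue() -> Int {
    print("initializer ran")
    return 42
}

// Hypothetical namespace. A stored type property is initialized
// lazily, on first access, regardless of where the file is compiled.
enum Globals {
    static let value: Int = makeValue()
}

func useValue() -> Int {
    print("before first use")
    let value = Globals.value   // the initializer runs here
    print("after first use")
    return value
}

print("result: \(useValue())")
// Prints, in order:
//   before first use
//   initializer ran
//   after first use
//   result: 42
```

If `value` were instead a top-level `let` in `main.swift`, the behavior described in this issue suggests "initializer ran" would appear at the point of the declaration rather than at first use.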

kavon commented 2 years ago

Right, unfortunately there's a difference between "top-level globals", which are globals defined at the top-level where you can also evaluate expressions, and globals in regular contexts, that is, not `main.swift`. The difference can also be seen if you take your single swift file, create a `@main` entry-point, and compile with `-parse-as-library`. That will treat the declarations at the top-level like the "regular" contexts.

Not certain if there are plans to fix this so that special documentation is not required, but @etcwilde might know.
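The `@main` plus `-parse-as-library` setup described above can be sketched roughly as follows. This is an illustration under assumptions, not a tested reproduction of the original project: `makeAgeString`, `describeAge`, and `Example` are hypothetical names, and the fixed string stands in for `String(processAgeInSeconds())`.

```swift
// Build as a library-style file, for example:
//   swiftc -parse-as-library Example.swift

// Hypothetical stand-in for String(processAgeInSeconds()); prints so
// the moment of initialization is visible.
func makeAgeString() -> String {
    print("initializing ageString")
    return "42"
}

// With -parse-as-library there is no top-level code, so this is an
// ordinary global and is initialized lazily on first access.
let ageString = makeAgeString()

func describeAge() -> String {
    return "ageString = \(ageString)"
}

@main
struct Example {
    static func main() {
        print("main started")
        print(describeAge())   // first access to ageString happens here
    }
}
```

Compiled this way, "main started" should be printed before "initializing ageString", which matches the lazy behavior the book describes.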

etcwilde commented 2 years ago

Yep, top-level variables are in a weird hybrid state where they live as global declarations, but are initialized sequentially like local variables.

Eventually, I'd like to make them all local variables to the implicit main function that gets generated.
This will ensure memory safety and data-race protection that we don't have today.
It is still a bit subtle if you're thinking of it like a library, but makes more sense if you're treating it like the top-level space in something like Python.
Changing the semantics is source-breaking, so it will have to wait until Swift 6 to become a feature.

Here was the original discussion piece I left on the matter on the forums, if you're interested: https://forums.swift.org/t/on-the-behavior-of-variables-in-top-level-code/52230
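To illustrate the "local variables of an implicit main function" model, here is a rough sketch that writes the wrapping function out by hand. The function `run` and the fixed value `3` are hypothetical stand-ins; this is not how the compiler generates top-level code today.

```swift
// Hypothetical explicit version of the implicit main function.
func run() -> String {
    // Stand-in for String(processAgeInSeconds()).
    let ageString = String(3)

    // A local function may capture ageString because the variable is
    // declared before the function is used.
    func log() -> String {
        return "ageString = \(ageString)"
    }

    return log()
}

print(run())
// Prints: ageString = 3
```

With ordinary local variables, moving the call to `log()` above the declaration of `ageString` is expected to be rejected at compile time (the local function would capture the variable before it is declared), instead of reading uninitialized memory at run time as in the crash reported above.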