swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0
67.58k stars 10.36k forks source link

Swift 6 complete concurrency checking permits `async` functions of nonisolated classes to mutate their state #74271

Open benpious opened 5 months ago

benpious commented 5 months ago

Description

This code

class C {

    var a: [Int] = []

    func mutateC() async {
        a.append(a.count)
    }

}

arguably shouldn't be accepted by the compiler in complete concurrency mode because mutateC can seemingly be called on any executor, and mutates C.

The Swift compiler does try to prevent this from occurring, and the code below does not compile:

    func doStuffInParallel() async {
        await withTaskGroup(
            of: Void.self
        ) { group in
            for _ in 0..<1000 {
                group.addTask { [c] in
                    await c.mutateC()
                }
            }
        }
    }

But by simply adding MainActor to doStuffInParallel, it compiles again.


    @MainActor
    func doStuffInParallel() async {
        await withTaskGroup(
            of: Void.self
        ) { group in
            for _ in 0..<1000 {
                group.addTask { @MainActor [c] in
                    await c.mutateC()
                }
            }
        }
    }

TSAN reports a data race, and this code crashes quite reliably on my machine when run.

Note that I've set the language version to Swift 6 and turned on complete concurrency checking; this was not build using Xcode's defaults.

Reproduction

class C {

    var a: [Int] = []

    func mutateC() async {
        a.append(a.count)
    }

}

class Interactor {

    let c = C()

    @MainActor
    func doStuffInParallel() async {
        await withTaskGroup(
            of: Void.self
        ) { group in
            for _ in 0..<1000 {
                group.addTask { @MainActor [c] in
                    await c.mutateC()
                }
            }
        }
    }

}

func start() async {
    let i = Interactor()
    print("start")
    await i.doStuffInParallel()
    print("ended")
    print(i.c.a.count)
}

await start()

Expected behavior

The compiler should not accept this code.

Environment

Xcode 16 beta 1 (16A5171c).

Additional information

No response

hborla commented 5 months ago

The bug is that this is valid

                group.addTask { @MainActor [c] in
                    await c.mutateC()
                }

c is not Sendable and passing c to the generic executor on the async call to mutateC risks concurrent access between the main actor and the generic executor.

It's fine for nonisolated async functions to mutate state on self, because if self it not Sendable, then the guarantee is that no two isolation domains can access it concurrently.

hborla commented 5 months ago

cc @gottesmm, I believe this should be diagnosed by region isolation because a value from the main actor region is being sent outside the main actor.

benpious commented 5 months ago

It's fine for nonisolated async functions to mutate state on self, because if self it not Sendable, then the guarantee is that no two isolation domains can access it concurrently.

Thanks for explaining. I realized after writing this bug report that Swift 6 was able to reject this when I didn't use the capture list, but I didn't understand why.

Fwiw this bug felt particularly pernicious to me because the error messages led me to believe that the compiler was only upset because I captured self, and that by only capturing c I had actually addressed the problem.

Also, in case it's helpful: this version also compiled incorrectly when I was testing earlier this afternoon:

@MainActor
    func doStuffInParallel() async {
       let c = C() // create locally instead of using a capture list
        await withTaskGroup(
            of: Void.self
        ) { group in
            for _ in 0..<1000 {
                group.addTask { @MainActor in
                    await c.mutateC()
                }
            }
        }
    }
gottesmm commented 4 months ago

I just checked with the latest snapshot (7/1):

class C {

    var a: [Int] = []

    func mutateC() async {
        a.append(a.count)
    }

}

class Interactor {

    let c = C()

    @MainActor
    func doStuffInParallel() async {
        await withTaskGroup(
            of: Void.self
        ) { group in
            for _ in 0..<1000 {
                group.addTask { @MainActor [c] in
                    await c.mutateC()
                }
            }
        }
    }

}

func start() async {
    let i = Interactor()
    print("start")
    await i.doStuffInParallel()
    print("ended")
    print(i.c.a.count)
}

await start()
test.swift:33:13: error: sending 'i' risks causing data races
31 |     let i = Interactor()
32 |     print("start")
33 |     await i.doStuffInParallel()
   |             |- error: sending 'i' risks causing data races
   |             `- note: sending main actor-isolated 'i' to main actor-isolated instance method 'doStuffInParallel()' risks causing data races between main actor-isolated and local nonisolated uses
34 |     print("ended")
35 |     print(i.c.a.count)
   |             `- note: access can happen concurrently
36 | }
37 | 

test.swift:21:23: error: main actor-isolated value of type '() async -> Void' passed as a strongly transferred parameter; later accesses could race
19 |         ) { group in
20 |             for _ in 0..<1000 {
21 |                 group.addTask { @MainActor [c] in
   |                       `- error: main actor-isolated value of type '() async -> Void' passed as a strongly transferred parameter; later accesses could race
22 |                     await c.mutateC()
23 |                 }

test.swift:22:29: error: sending 'c' risks causing data races
20 |             for _ in 0..<1000 {
21 |                 group.addTask { @MainActor [c] in
22 |                     await c.mutateC()
   |                             |- error: sending 'c' risks causing data races
   |                             `- note: sending main actor-isolated 'c' to nonisolated instance method 'mutateC()' risks causing data races between nonisolated and main actor-isolated uses
23 |                 }
24 |             }
gottesmm commented 4 months ago

A few things:

  1. The first error is correct but I think the error should say that 'I' is initially non-isolated. But it is correct since doStuffInParallel does main actor isolated 'i' meaning the use later is incorrect.
  2. The second error I believe is incorrect since @MainActor function should be Sendable so we shouldn't emit that error.
  3. With this version of the compiler, we are properly emitting the main actor-isolated error here.

@benpious your thoughts? I think the original issue is fixed (but we can use this to track the others perhaps)

mattmassicotte commented 4 months ago

For 2, while the closure is @MainActor, it also captures c, which is non-isolated and non-Sendable. I'm trying to understand why this could be a safe thing to do...

gottesmm commented 4 months ago

@mattmassicotte c becomes MainActor isolated because it is sent into the closure.

hborla commented 4 months ago

@gottesmm c cannot become main actor isolated because it's a field of self. It should be invalid to transfer it.

EDIT: I'm wrong, because the function that the send happens in is already on the main actor, so self already has to be on the main actor region. There's actually no send happening at the point of capture, right?

benpious commented 4 months ago

@gottesmm Yeah, as long as the data race itself is no longer accepted I would the original issue fixed. Thanks for looking at it!

I'm guessing that when you're asking for my thoughts, this is mostly around whether the errors seem reasonable? I'll just respond to the first two points:

  1. "If the instance is isolated to main, and the function we're calling is isolated to main, how is it possible for there to be 'local nonisolated uses'" would be my thoughts if I had seen this while writing this code.
  2. Yeah my understanding from this is that should already be Sendable in 6. Also, even if it really wasn't Sendable, I wouldn't find this error very useful, because I'd want to know which capture was preventing Sendable from being inferred, or that I should mark it as Sendable explicitly to see that.

Hopefully this was what you were looking for, if not lmk.