belozierov / SwiftCoroutine

Swift coroutines for iOS, macOS and Linux.
https://belozierov.github.io/SwiftCoroutine
MIT License
836 stars 51 forks source link

setjmp/longjmp are Undefined Behavior in Swift #1

Closed calebkleveter closed 4 years ago

calebkleveter commented 4 years ago

I was ecstatic when I saw this repo. Unfortunately, I was informed that the basis of the coroutines defined in this project are unsafe:

setjmp and longjmp aren't safe to use in Swift (or ARC ObjC)—retain counts will leak, and weak refs may corrupt memory. - Joe Groff

Further references:

https://twitter.com/search?q=from%3Ajckarter%20setjmp&src=typed_query

Unless this information is incorrect, there should probably at least be a warning in the README on this.

belozierov commented 4 years ago

Thanks a lot for your comment.

I did some testing, and really in the current 1.0.2 version there is a big problem with leaks caused by setjmp/longjmp which I’ve missed. So thanks for spotting it.

Setjmp/longjmp is actually a tricky API and can cause leak problems if used incautiously, that’s why Apple has disabled it for Swift. But at the same time, it is a convenient OOTB solution for saving and loading CPU registers. However, this API requires very careful usage and following certain rules to avoid the above issues with ARC.

I’ve already found a way to handle setjmp/longjmp leaks. I’ll need to slightly change the existing implementation and in a few days I will release a new update with fixes. So you’ll be able to check it yourself soon.

I really appreciate your feedback.

belozierov commented 4 years ago

In version 1.0.4 a problem with setjmp/longjmp memory leaks was fixed. Please find a test project in the attachment. Here you can check the usage with the leak instrument and look over the memory usage indicators. You can also compare the results with version 1.0.2 and see the difference. Please LMK your feedback in case you'll test this.

Thanks again for raising this issue. SwiftCoroutineMemoryLeakTest.zip

belozierov commented 4 years ago

As to undefined behavior.

It needs to be clarified that setjmp/longjmp has undefined behavior when used with ARC and not with Swift in general. This can also be assumed from the statement by Joe Groff. But the same applies to any other api and assembly code that change the normal function execution and return, including adjusting values of the cpu special registers. This happens because ARC can’t track such things and it can lead to leaks and memory corruption.

But on the other hand, it is impossible to create proper coroutines without such modifications. The suspend/resume feature in coroutine itself implies a change of normal function execution that ARC is unable to track.

Setjmp/longjmp was used in the library to minimize assembly code in the project and also because it is high-level, tested, predictable and is included in the standard C library.

Several tricks were used to avoid ARC disruption when this api is applied. It’s guaranteed that when a coroutine is suspended it will always resume in the same place and only once. This ensures safe routines execution and normal ARC functioning. The only exception is the return from coroutine’s stack with longjmp after its completion. In this case, all objects that could have been captured in the stack by ARC, including weak and unowned references, will be leaked. Therefore, the implementation guarantees that no objects will be captured at the moment of returning from the coroutine. Any api that violates these statements is private and inaccessible to a user.

belozierov commented 4 years ago

I discussed this issue with Joe Groff and here is his reply:

If you're always resuming coroutines, then that should address the issue of leaks. Last I checked, calling setjmp directly from Swift didn't lower it correctly in LLVM (it needs the special returns_twice LLVM attribute so that it doesn't get miscompiled), but if you're calling it from C that shouldn't be a problem.

One way to look at it is that you aren't really using setjmp/longjmp in Swift, but as an implementation detail in order to expose coroutine primitives as Swift API. As long as the implementation maintains the invariants assumed by the Swift runtime environment, then it shouldn't matter how the implementation achieves those goals.

calebkleveter commented 4 years ago

Thanks for looking into this! I've decided to go ahead and add support to NIO's EventLoopFuture type for the async/await syntax, similar to what your CoFuture type has: https://github.com/calebkleveter/NIOAsync

nonameplum commented 4 years ago

@belozierov @calebkleveter Based on this input is it this framework production ready or it will stay rather as a curiosity?

belozierov commented 4 years ago

@nonameplum The version 1.0.7 is stable and has no bugs that I am aware of. I marked it as a pre-release because in the nearest future I plan to make some changes in api.

Currently I’m working on a new version that will include an improved version of CoFuture with much better performance (inspired by EventLoopFuture). Thus, its api will be slightly changed and finalized. This might lead to some migration.

In case of any issues or questions, feel free to contact me.

belozierov commented 4 years ago

@nonameplum To sum up on setjmp/longjmp

Setjmp got some bad fame in early versions of swift because of memory leaks, so it was blocked and is now unavailable in swift. Therefore, the coroutine core was written in C where its usage is native and creates no problems. Joe Groff's answer fully confirms this.

nonameplum commented 4 years ago

@belozierov Thanks a lot for the explanation.