Open swift-ci opened 5 years ago
Hm. The error I'm getting is a stack overflow, not an early release. I don't see an obvious issue in the generated code, though. (One problem I could see is if nthll
held on to the original ll
parameter, perhaps for debugging purposes, but looking at the generated LLVM IR that doesn't seem to be happening.)
It's also kind of a problem that releasing deeply nested object graphs can result in overflowing the stack, but that's not a new problem, at least.
Comment by William Gordon Goodsman (JIRA)
@belkadan, I don't think there should be a stack overflow, as it is a lazily deferred list and doesn't really do recursive function calling other than behind the deferred "thunk" closures.
As to when the head of the list is retained, that is what actually works without crashing (if one sets a variable to the head of the lazy list and then passes that to the enumeration routine as an l-value) and it doesn't really build stack memory as new elements should be generated on the heap; while it is true that holding onto the head of the list uses heap memory, that is what is intended as that is how the elements are memoized. This is a frequent use of lazy lists, and in fact is my first application of it, which fortunately is not impacted by this bug.
The problem shows up as a segmentation fault on this Wandbox implementation of the above code, no matter what version of Swift is used from Swift version 3.0 to version 5.0 HEAD
Whether or not there should be a stack overflow, there is a stack overflow. And it's not even 100% incorrect for there to be a stack overflow, because Swift has no way to force a parameter to be released immediately after the last use. Releases are only guaranteed to happen at the end of a scope.
Comment by William Gordon Goodsman (JIRA)
@belkadan, OK, if it's a stack overflow on your machine, two questions:
Why doesn't it show up as a stack overflow on my machine or on wandbox but rather shows up as a segfault?
Why does it take a variable number of loops to trigger? Perhaps that is a clue as to what's causing it?
What I meant is that my code shouldn't trigger a stack overflow as it doesn't call recursively, but I guess the runtime could be building a series of reference count operations which are called recursively, which would be the bug.
As to memory releases, I understood that Swift uses Automatic Reference Counting. Would that not mean that a variable that is continuously reassigned to refer to different heap objects from the head to the tail of the list (as here) would have a the reference count for the head decreased by a code injection and the head deallocated at that time? To cause a stack overflow, these reference count decreases would need to be deferred and built on the stack, which would then cause the stack overflow when there got to be too many of them. Why would the runtime do this? Unless it was to make deallocation time consistent as "all deallocations at the end of the scope"? But why wait?
If it works as I speculate, it doesn't seem a good design and would make consuming of a long lazy list impossible; only de-allocation immediately when reference count goes to zero would work, with the final tail reference consumed when it goes out of scope, which might not happen at the end of the function if it were passed as a return value.
Copy/move sematics?
A segfault is the most common signature of a stack overflow. It presumably takes a variable number of loops because in an optimized build the stack frame used to destroy an LL instance is smaller.
The reference count that doesn't go to 0 is the head of the list, which as I noted is passed as ll
. It's valid until the end of the scope because the destructor could have side effects. (It's also valid past the end of the scope because the current convention is for callers to destroy parameters in most cases rather than callees, but even using the experimental __owned
annotation to override that doesn't make your example work.)
cc @rjmccall
Comment by William Gordon Goodsman (JIRA)
@belkadan, ah, I see. But without move semantics, if the r-value head is passed into the function and not moved to the `ll` value's ownership, it will do as you say and build a stack of not-yet-actionable series of destructors until the end of the function when `ll` goes out of scope. it explains the behavior and the bug. With the current implementation, in order to avoid stack overflow then the "chain of destructors" can't be recursive or queued to the stack, but must build lazily, likely to the heap, in order for this to work.
Even if the destructor chain is built lazily to the heap, it isn't really ideal as that will put quite a lot of overhead in pent up destructors at the end of the function/scope, which isn't the desired behavior.
I guess that maybe this won't be fixed more ideally so that the list is consumed as the owning reference to the head is re-assigned until the refinement about move semantics/ownership is implemented that might get to us about version 6.0 or later. But at least if that plan works out as proposed, it won't affect the desired code but just the effect so it will be just as easy to write code, unlike in Rust where one must always think about the lifetimes and choose when to use reference counting themselves.
Environment
Windows 10 64-bit build 1803 updated as of 30 December 2018. Intel x5-Z8350 CPU with $ Gigabytes of RAM. Running Ubuntu 18.04 LTE under Windows Subsystem for Linux (WSL). Swift version 4.2.1 downloaded from swift.org for Ubuntu 18,04. Compiling test program with swiftc and the bug seems to appear no matter if optimization is turned on or not, but the bug appears at a lower loop count with no optimization at all as noted in the bug report.Additional Detail from JIRA
| | | |------------------|-----------------| |Votes | 0 | |Component/s | Compiler | |Labels | Bug, RunTimeCrash | |Assignee | None | |Priority | Medium | md5: 97062cc48896110f9f7513de9345f8e6Issue Description:
I have successfully written an implementation of a LazyList\<T> class that can be used to memoize a continuous lazily computed linked sequence which may be infinite.
It works fine when one uses a "l-value" to hold onto the head of the linked lazy list that is produced, but when one passes a "r-value" to a consuming function, the compiler appears to release the class memory even though there are still held references to the tails of the list, eventually causing a "Segmentation fault (core dumped)" program crash.
The short code reproducing the fault on my machine is as follows:
The expected result is "The nth lazy list head is 1000000", but when the loops constant is high enough, a Segmentation fault is triggered instead.
In the test file, I have stripped out the code that makes it conform to LazySequenceProtocol, Sequence, and IteratorProtocol and the problem persists.
For loops of a few tens of thousands of consuming list heads there is no problem; the problem is experienced for larger loop sizes, but the actual limit varies: The problem shows up for lower loop counts of about 70,000 on my machine (listed below) without any optimization and at about a loop count of 250,000 with all optimizations turned on (-Ounchecked). However, the actual exact number of loops required to trigger the problem varies slightly from run to run.
My theory that I don't seem to have the means to prove is that the "r-value" argument is being recognized and released on assignment to an interior mutable variable, but that some how when the head is released, it is also triggering the release of the tail while there are still references to it (or rather references that haven't been generated yet), eventually causing attempted use of memory that has not been locked to that reference.
With that theory in mind, I have tried creating other temporary references to the head and tail to try to ensure they don't get released, and have even used Unmanaged to temporarily increase the reference count during the critical section while ownership of the chain is being passed from the head of a given list to its tail, but all to no avail.
Now, I hope that "those in the know" will show that I am doing something wrong, but I really can't see it, and even if I were, there should be some other kind of runtime error other than a Segmentation fault and there shouldn't be a variable number of loops to trigger the problem. Or there should be a documented way (other than the use of a "l-value", which prevents consumption of the list) to ensure the list is consumed cleanly.