Closed thestinger closed 8 years ago
This seems doable and straightforward. It's basically adding another bit to track per variable ("was moved") which uses intersection on join when propagating.
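The "intersection on join" idea can be sketched purely illustratively (this is not compiler code; `join` is a made-up helper, written in modern Rust):

```rust
// Each variable gets a "was certainly moved" bit. Where two control-flow
// paths meet, the bits are joined by intersection: a variable counts as
// moved after the join only if it was moved on *every* incoming path.
fn join(moved_a: &[bool], moved_b: &[bool]) -> Vec<bool> {
    moved_a.iter().zip(moved_b).map(|(&a, &b)| a && b).collect()
}

fn main() {
    // The then-branch moves vars 0 and 1; the else-branch moves only var 0.
    let after_then = [true, true];
    let after_else = [true, false];
    // Only var 0 is certainly moved after the branches rejoin.
    assert_eq!(join(&after_then, &after_else), vec![true, false]);
}
```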
However, there is another approach that @pcwalton and I have toyed with from time to time, which is saying that when you move things, the destructor for the moved thing runs "early"---basically at the point where control flow on the non-moved path (if any) rejoins the control flow from the moved path. The idea here was to remove the need to zero 100% of the time. However, I am not sure if that idea (early destructing) is such a good idea, because it seems hard to specify and predict, whereas the current semantics (destructors run at the same time unless moved) is easier.
So if we keep current semantics, then I think something like this seems like a good optimization. In principle we could do better and identify nodes in the CFG where, upon entering the node, we know the value will be moved (this is what liveness will be computing, actually). Then we could avoid zeroing as long as we dominate the node. That might be trickier though given the way failure and unrolling works, so maybe it's better to only optimize the case where the variable declaration is such a node.
http://llvm.org/docs/LangRef.html#llvm-lifetime-end-intrinsic may also be interesting
Nominating for milestone 1, well-defined.
Accepted for well defined. This needs to be documented.
I don't think the alternative would necessarily be harder to specify and predict. It's basically dynamic vs. static transfer of ownership. With the current rule, a variable's destructor is run at the end of the scope it was declared in, unless ownership of it was dynamically transferred to an inner scope. With the alternative rule, referring to the variable in an inner scope statically transfers ownership of it to that inner scope. The compiler already uses the latter rule to determine when a variable is legal to access, so the programmer has to think about it (if not, the compiler will make her). The alternative would make destructors follow the same rules. Currently you can have points in the code where the compiler won't let you access a variable but the variable's destructor may not have been run, which might be unintuitive itself.
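The "unintuitive" situation described above can be demonstrated with a small modern-Rust sketch (the `DROPPED` flag is just instrumentation for the example):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

static DROPPED: AtomicBool = AtomicBool::new(false);

struct Guard;
impl Drop for Guard {
    fn drop(&mut self) {
        DROPPED.store(true, Ordering::SeqCst);
    }
}

// Returns whether `x`'s destructor has already run at the point where the
// two control-flow paths rejoin -- it depends on the branch taken at runtime.
fn conditional_move(take_branch: bool) -> bool {
    let x = Guard;
    if take_branch {
        drop(x); // ownership moved out on this path only
    }
    // The compiler rejects any use of `x` here on *either* path, yet
    // whether its destructor has run is only known dynamically.
    DROPPED.load(Ordering::SeqCst)
}

fn main() {
    assert!(conditional_move(true));
    DROPPED.store(false, Ordering::SeqCst);
    assert!(!conditional_move(false));
}
```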
Current plan (based on meeting discussion) is to have the compiler issue an error if there are ever two control flows that meet where one does a move and the other does not, so that this issue becomes a moot point. I am in the process of preparing a pull request taking some steps in this direction.
I'm somewhat working on this, btw. Also, it will affect the semantics of what moves are legal, particularly around vectors, since in those cases the compiler will not be able to determine statically which indices have been moved and which have not. I imagine we'll just disallow moves like `let x = vec[y]` and instead offer a method like `let x = vec.move(y)`, where `move` is a `fn(self)` method.
Thinking a bit more, there is no need for a `fn(self)` method. Something like `pop` etc. will suffice, particularly combined with `swap`:
```
impl<T> ~[T] {
    fn remove(&mut self, index: uint) -> T {
        assert!(index < self.len());
        if index != self.len() - 1 {
            self.swap(index, self.len() - 1);
        }
        self.pop()
    }
}
```
FWIW, that method currently exists: `.swap_remove`.
@huonw that's right, I knew I'd seen this somewhere :)
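In today's syntax, `Vec::swap_remove` does exactly this -- it moves the value out without leaving a hole:

```rust
fn main() {
    let mut v = vec!['a', 'b', 'c', 'd'];
    // Swaps the target with the last element and pops it, so there is
    // no gap to zero or track -- O(1), but element order changes.
    let x = v.swap_remove(1);
    assert_eq!(x, 'b');
    assert_eq!(v, vec!['a', 'd', 'c']);
}
```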
cc me
This interacts with the meaning/utility of the `std::ptr::read_and_zero_ptr` function, right?
I've been playing with that function recently in attempts to prototype some hypothetical finalization APIs. So I am curious to know what would replace it.
If a struct currently carries a `~T`, I believe I can use `read_and_zero_ptr` to zero out that reference and keep the destructor for `~T` (and thus for `T` itself) from running. What is the replacement for that?
(Or should the struct field use `Option<~T>` instead of `~T`, so that I could swap in `None` to prevent the destructor for `~T` from running when the struct is destroyed?)
@pnkfelix: I haven't had any issues with avoiding a reliance on the drop flag for various smart pointers and low-level data structures.
You can store `*mut T` instead of `~T`, as the `Rc<T>` and `Weak<T>` implementations already do. They'll need to be fully migrated from `~T` when they gain allocator support anyway. They can either do the memory allocation directly or use a `Uniq<T, A = Heap>` type.
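A minimal sketch of the raw-pointer pattern, in modern syntax (`Box<T>` standing in for the old `~T`; `RawBox` is a made-up name for illustration):

```rust
// Owning a heap allocation through a raw pointer means the type needs
// no drop flag: ownership is never moved out of `ptr`, so the
// destructor can free unconditionally.
struct RawBox<T> {
    ptr: *mut T,
}

impl<T> RawBox<T> {
    fn new(value: T) -> RawBox<T> {
        RawBox { ptr: Box::into_raw(Box::new(value)) }
    }
    fn get(&self) -> &T {
        unsafe { &*self.ptr }
    }
}

impl<T> Drop for RawBox<T> {
    fn drop(&mut self) {
        // No "was I moved?" check needed before freeing.
        unsafe { drop(Box::from_raw(self.ptr)); }
    }
}

fn main() {
    let b = RawBox::new(41);
    assert_eq!(*b.get() + 1, 42);
}
```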
Here's an example of a ring buffer leaving an uninitialized gap in the underlying vector:
https://github.com/thestinger/rust-core/blob/master/core/deque.rs
The trick is just preventing the underlying destructor from running. At the moment it leaves the vector length at zero, but if it did reuse the field for the container length then it would just call `set_len` at the end. It could transmute the vector to another type, but it's easy enough to prevent it from destroying the contained values without that.
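The `set_len` trick can be shown in miniature with today's `Vec` (a toy sketch, not the deque code itself):

```rust
use std::ptr;

fn main() {
    let mut v = vec![String::from("a"), String::from("b")];
    unsafe {
        // Move the values out with raw reads; the vector still
        // believes it owns them at this point.
        let a = ptr::read(v.as_ptr());
        let b = ptr::read(v.as_ptr().add(1));
        // Setting the length to zero stops Vec's drop glue from running
        // the element destructors (the buffer itself is still freed),
        // so `a` and `b` are not dropped twice.
        v.set_len(0);
        assert_eq!(a, "a");
        assert_eq!(b, "b");
    }
}
```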
I do know one place where the drop flag results in a slight increase in performance. A priority queue needs to shuffle elements along in one direction. It's doing comparisons along the way, and those can unwind. So instead of just doing log2(n) shallow copies, it ends up needing to do 3 * log2(n) shallow copies in order to perform swaps. That's a significant performance hit. The drop flag allows performing log2(n) shallow copies and log2(n) zeroes, which is still worse than when unwinding isn't present, but certainly better than swaps.
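The single-copy-per-step technique in question is the "hole" pattern that sift-up in a binary heap can use. A simplified min-heap sketch over `i32` (where the comparison cannot unwind; for a user-defined `Ord` that can panic, the open hole is exactly where unwinding safety -- and hence the drop flag discussion -- comes in):

```rust
use std::ptr;

// Sift up with one shallow copy per step instead of a three-copy swap,
// leaving a "hole" that is filled once at the end.
fn sift_up(v: &mut Vec<i32>, mut pos: usize) {
    unsafe {
        let value = ptr::read(v.as_ptr().add(pos)); // open the hole
        while pos > 0 {
            let parent = (pos - 1) / 2;
            if v[parent] <= value { break; }
            // single shallow copy: parent slot -> hole
            ptr::copy_nonoverlapping(v.as_ptr().add(parent),
                                     v.as_mut_ptr().add(pos), 1);
            pos = parent;
        }
        ptr::write(v.as_mut_ptr().add(pos), value); // fill the hole
    }
}

fn main() {
    // valid min-heap [1, 3, 2], then 0 pushed at the end
    let mut v = vec![1, 3, 2, 0];
    sift_up(&mut v, 3);
    assert_eq!(v, vec![0, 1, 2, 3]);
}
```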
@pnkfelix more or less what @thestinger said. I personally would be inclined to model that situation with an `Option<~T>` and swap it with `None` -- this is, after all, what the low-level code was doing, just exposed.
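In today's std, swapping in `None` is exactly what `Option::take` does (modern `Box<T>` for the old `~T`):

```rust
struct Wrapper {
    inner: Option<Box<u32>>,
}

fn main() {
    let mut w = Wrapper { inner: Some(Box::new(7)) };
    // `take()` swaps `None` into the field and moves the Box out, so when
    // `w` is later dropped there is nothing left for the field to destroy.
    let taken = w.inner.take();
    assert_eq!(taken, Some(Box::new(7)));
    assert!(w.inner.is_none());
}
```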
@nikomatsakis I guess my concern was that I did not want the use of `Option<~T>` to leak out into client code ... the use case I am dealing with is when an object has been enqueued for finalization and I want to, for lack of a better term for now, "mess with its innards."
However, I just realized that I can probably safely transmute e.g. a `~(A, ~B)` to a `~(A, Option<~B>)`, and then I get the ability to do the mucking about internally, and the client who passed me the `~(A, ~B)` should be none-the-wiser.
Certainly we could still have the drop glue check for null even if WE no longer zero.
cc @flaper87
I believe part of the implications of this work (which I have been planning on doing after the flowgraph-based dataflow lands) are to remove the drop-flag from our droppable struct and enum representations.
However, I have noticed some bug reports and RFCs filed recently that inject (or would inject) new dependencies on the drop flag. In particular RFC PR 99; and separately, potentially issue #14894 (depending on what solution the filer of that latter had in mind).
Meanwhile, in the time since this bug was filed and the above plan-of-action decided upon, we now have a formal RFC process.
So, I am now wondering whether I or someone else should file an actual RFC for this change, since it does represent a change to the semantics of drop from what we offer today.
Cc: @nikomatsakis
@pnkfelix with regards to RFC PR 99 I describe how to avoid depending on drop flags in this comment.
cc #17046
Per discussion in the meeting, nominating for removal from P-backcompat-lang, since we have decided on dynamic drop for the time being.
To be clear, there is still work to do here, just not backwards incompatible language work.
right; removal of zeroing is probably a P-backcompat-libs thing, right?
Assigning P-backcompat-libs, not 1.0 milestone.
P-high, not 1.0 (but one can always hope!)
Unless someone is already working on this, I'd like to give it a shot this weekend.
@dotdash I do not think this is a mere weekend project. Have you read the discussion on e.g. RFC 320 ?
@pnkfelix Oh, I don't expect to finish it over the weekend, but rather see if it's something that I could reasonably fit into my schedule at all. I hope to get that part done over the weekend.
For those interested, I have put up an internals post describing some short-term future-proofing efforts, here:
http://internals.rust-lang.org/t/attention-hackers-filling-drop/1715
Linking relevant PR: Replace zeroing-on-drop with filling-on-drop #23535
What's the state of this?
I believe it is waiting for the MIR work in the compiler to shape up some more.
Programmers should be able to mutate structure memory without possibly interfering with drop flags -- for a 'systems programming language', the current implementation does not make sense, nor is it very efficient.
Along with a few other people, I'm eager to use Rust -- but problems like this make the language less appealing.
This issue is over three years old now and is still indeed a pretty major pain when wrapping C libraries, effectively disallowing zero-copy conversions in a lot of cases and forcing one to jump through multiple ugly hoops.
In the cases where `#[unsafe_no_drop_flag]` could help today, is it wise to rely on the assumption that there will come a time when droppable objects could be `#[repr(C)]`?
As noted by @huonw above, at this point the compiler team considers this work to be dependent on MIR. The compiler team has made some baby steps forward on dynamic drop flags, but it is not the highest priority item right now. (Examples of things of higher priority: MIR itself; incremental compilation.)
The real point is that I don't want the age of this issue to be interpreted as meaning that the Rust team isn't ever going to do this. We have every intention of doing it. I feel silly even having to write that down, but I worry that if no one says anything, then people are going to get an incorrect impression from the comment history here.
@pnkfelix Thanks for clarifying! Didn't mean to sound negative, was mostly just wondering if it's wise for library developers to assume this will land in some (finite) time and make their choices accordingly :)
I expect it to land this year.
:tada:
I have some code that spends a measurable amount of time checking for drop flags (there's an inner loop which drops an `Option<Box<...>>` which is virtually always `None`), and so I'm kind of wondering if we have a ticket for the other side of this, where we delete `POST_DROP_U8` and stop generating code in drop methods to check for it. Do we have a tracking ticket for that?
Incidentally, this ticket is currently the tracking issue for the POST_DROP_U8 constant; maybe that should be repointed now that this ticket is closed?
(Tracking issue for RFC 320.)
Original Description
The current liveness check has a concept of "alive" and "maybe dead", so I'll just describe this in terms of a hypothetical "move optimization pass" (although liveness can probably be extended with "surely dead").
Hypothetical move optimization pass:
In each scope where they exist, all variables are given an associated boolean representing whether they were certainly moved from. Moving from a variable in a scope marks it as such for that scope. This can be bubbled up if and only if the variable is also moved from in every other branch.
If the compiler can bubble this up to the scope where the variable is declared, the drop glue can be omitted and all moves from that variable do not need to zero it.
This can probably be extended to fields too, but I'm not sure on what the semantics would be.
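A toy illustration (in modern syntax) of the case the pass targets: a variable moved from on every branch is certainly dead at the end of its scope, so no drop glue or zeroing would be needed for it:

```rust
fn consume(s: String) -> usize {
    s.len()
}

fn main() {
    let s = String::from("hello");
    let n = if s.len() > 3 {
        consume(s) // moved here...
    } else {
        consume(s) // ...and here: moved on *every* branch, so the
                   // "certainly moved" bit bubbles up to s's own scope
    };
    assert_eq!(n, 5);
}
```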
@nikomatsakis: does this look like something that would be reasonable to implement (probably as part of liveness)? It doesn't actually need to bubble it up to be useful -- even if it only worked in a local scope, it would allow the destructor of the `save` variable in the `TreeMap` `split` and `skew` functions to be omitted (since it's moved from in the same scope where it's declared), and that would be a big performance win.