Open Gankra opened 7 years ago
Maybe this doesn't need to include the kitchen sink, but it could have a range parameter, so that it's like a superset of drain. Any drawbacks to that? I guess adding bounds checking for the range is a drawback, it's another thing that can panic. But drain_filter(.., f) can not.
Is there any chance this will stabilize in some form in the not to far future?
If the compiler is clever enough to eliminate the bounds checks
in the drain_filter(.., f)
case I would opt for doing this.
( And I'm pretty sure you can implement it in a way
which makes the compiler clever eneugh, in the worst
case you could have a "in function specialization",
basically something like if Type::of::<R>() == Type::of::<RangeFull>() { dont;do;type;checks; return }
)
I know this is bikeshedding to some extent, but what was the reasoning behind naming this drain_filter
rather than drain_where
? To me, the former implies that the whole Vec
will be drained, but that we also run a filter
over the results (when I first saw it, I thought: "how is this not just .drain(..).filter()
?"). The former on the other hand indicates that we only drain elements where some condition holds.
No idea, but drain_where
sounds much better and is much more intuitive.
Is there still a chance to change it?
.remove_if
has been a prior suggestion too
I think drain_where
does explains it the best. Like drain it returns values, but it does not drain/remove all values but just such where a given condition is true.
remove_if
sounds a lot like a conditional version of remove
which just removes a single item by index if a condition is true e.g. letters.remove_if(3, |n| n < 10);
removes the letter at index 3 if it's < 10.
drain_filter
on the other hand is slightly ambiguous, does it drain
then filter
in a more optimized way (like filter_map) or does if drain so that a iterator is returned comparble to the iterator filter
would return,
and if so shouldn't it be called filtered_drain
as the filter get logically used before...
There is no precedent for using _where
or _if
anywhere in the standard library.
The "said equivalent" code in the comment is not correct... you have to minus one from i at the "your code here" site, or bad things happens.
IMO it's not filter
that's the issue. Having just searched for this (and being a newbie), drain
seems to be fairly non-standard compared to other languages.
Again, just from a newbie perspective, the things I would search for if trying to find something to do what this issue proposes would be delete
(as in delete_if
), remove
, filter
or reject
.
I actually searched for filter
, saw drain_filter
and kept searching without reading because drain
didn't seem to represent the simple thing that I wanted to do.
It seems like a simple function named filter
or reject
would be much more intuitive.
On a separate note, I don't feel as though this should mutate the vector it's called on. It prevents chaining. In an ideal scenario one would want to be able to do something like:
vec![
"",
"something",
a_variable,
function_call(),
"etc",
]
.reject(|i| { i.is_empty() })
.join("/")
With the current implementation, what it would be joining on would be the rejected values.
I'd like to see both an accept
and a reject
. Neither of which mutate the original value.
You can already do the chaining thing with filter
alone. The entire point of drain_filter
is to mutate the vector.
@rpjohnst so I searched here, am I missing filter
somewhere?
Yes, it's a member of Iterator
, not Vec
.
Drain is novel terminology because it represented a fourth kind of ownership in Rust that only applies to containers, while also generally being a meaningless distinction in almost any other language (in the absence of move semantics, there is no need to combine iteration and removal into a single ""atomic"" operation).
Although drain_filter moves the drain terminology into a space that other languages would care about (since avoiding backshifts is relevant in all languages).
I came across drain_filter
in docs as a google result for rust consume vec
. I know that due to immutability by default in rust, filter doesn't consume the data, just couldn't recall how to approach it so I could manage memory better.
drain_where
is nice, but as long as the user is aware of what drain
and filter
do, I think it's clear that the method drains the data based on a predicate filter.
I still feel as though drain_filter
implies that it drains (i.e., empties) and then filters. drain_where
on the other hand sounds like it drains the elements where the given condition holds (which is what the proposed function does).
Shouldn't linked_list::DrainFilter
implement Drop
as well, to remove any remaining elements that match the predicate?
Yes
Why exactly does dropping the iterator cause it to run through to the end? I think that's surprising behaviour for an iterator and it could also be, if desired, done explicitly. The inverse of taking only as many elements out as you need is impossible on the other hand because mem::forget
ing the iterator runs into leak amplification.
I've been using this function a lot and I always have to remember to return true
for the entries I want to drain, which feels counter-intuitive compared to retain()
/retain_mut()
.
On an intuitive logical level, it would make more sense to return true
for entries I want to keep, does anyone else feel this way? (Especially considering that retain()
already works this way)
Why not do that, and rename drain_filter()
to retain_iter()
or retain_drain()
(or drain_retain()
)?
Then it would also mirror retain()
more closely!
This is why I proposed to rename it to
drain_where
as it then is clear that:
It is a form of drain so we use drain in the name
By using where
in combination with drain
it is clear that the
elements where the predicate is true get drained, i.e removed and returned
It is more in sync with other names in std, normally if you have a
function consisting of two predicates you can emulate it (roughly) by using
functions representing each of the predicates in a "and then" fashion, e.g.
filter_map
can be emulated (roughly) as filter an then map
. The current
name indicates it is drain and then filter
, but it isn't even close to it
as it does not do a complete drain at all.
On Sun, Feb 25, 2018, 17:04 Boscop notifications@github.com wrote:
I've been using this function a lot and I always have to remember to return true for the entries I want to drain, which feels counter-intuitive compared to retain_mut(). On a primal level, it would make more sense to return true for entries I want to keep, does anyone else feel this way? Why not do that, and rename drain_filter() to retain_filter()?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/rust-lang/rust/issues/43244#issuecomment-368320990, or mute the thread https://github.com/notifications/unsubscribe-auth/AHR0kfwaNvz8YBwCE4BxDkeHgGxLvcWxks5tYYRxgaJpZM4OY1me .
But with drain_where()
the closure would still have to return true for elements that should be removed, which is the opposite of retain()
which makes it inconsistent..
Maybe retain_where
?
But I think you're right that it makes sense to have "drain" in the name, so I think drain_retain()
makes the most sense: It's like drain()
but retaining the elements where the closure returns true
.
While it does not change, that you have to return true it makes clear that you have to do so.
I personally would use either:
a. drain_where
b. retain_where
c. retain
I would not use drain_retain
.
Drain and retain speak about the same kind of process but from opposite
perspectives, drain speaks about what you remove (and r return) retain
speaks about what you keep (in the way it is already used in std).
Currently retaim
is already implemented in std with the major difference
that it is discarding elements while drain
is returning them, which makes
retain
(sadly) unsuited to be use in the name (except if you want to call
it retain_and_return
or similar).
Another point which speaks for drain is ease of migration, i.e. migrating
to drain_where
is as easy as running a word based search and replace on
the code, while changing it to retain would need a additional negation of
all used predicates/filter functions.
On Sun, Feb 25, 2018, 18:01 Boscop notifications@github.com wrote:
But with drain_where() the closure would still have to return true for elements that should be removed, which is the opposite of retain() which makes it inconsistent.. Maybe retain_where? But I think you're right that it makes sense to have "drain" in the name, so I think drain_retain() makes the most sense: It's like drain() but retaining the elements where the closure returns true.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/rust-lang/rust/issues/43244#issuecomment-368325374, or mute the thread https://github.com/notifications/unsubscribe-auth/AHR0kfG4oZHxGfpOSK8DjXW3_2O1Eo3Rks5tYZHxgaJpZM4OY1me .
But how often would you migrate from drain()
to drain_filter()
?
In all cases until now, I migrated from retain()
to drain_filter()
because there is no retain_mut()
in std and I need to mutate the element! So then I had to invert the closure return value..
I think drain_retain()
makes sense because the drain()
method drains unconditionally all elements in the range, whereas drain_retain()
retains the elements where the closure returns true
, it combines the effects of the drain()
and retain()
methods.
Sorry I mean to migrate from drain_filter
to drain_where
.
That you have a solution using retain
and then need to use
drain_filter
is a aspects I have not yet considered.
On Sun, Feb 25, 2018, 19:12 Boscop notifications@github.com wrote:
But why would you migrate from drain() to drain_filter()? In all cases until now, I migrated from retain() to drain_filter() because there is no retain_mut() in std and I need to mutate the element! I think drain_retain() makes sense because the drain() method drains unconditionally all elements in the range, whereas drain_retain() retains the elements where the closure returns true.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/rust-lang/rust/issues/43244#issuecomment-368330896, or mute the thread https://github.com/notifications/unsubscribe-auth/AHR0kSayIk_fbp5M0RsZW5pYs3hDICQIks5tYaJ0gaJpZM4OY1me .
Ah yes, but I think the "price" of inverting the closures in current code that uses drain_filter()
is worth it, to get a consistent and intuitive API in std and then in stable.
It's only a small fixed cost (and eased by the fact that it would go along with a renaming of the function, so the compiler error could tell the user that the closure has to be inverted, so it wouldn't silently introduce a bug), compared to the cost of standardizing drain_filter()
and then people always having to invert the closure when migrating from retain()
to drain_filter()
.. (on top of the mental cost of remembering to do that, and the costs of making it harder to find it in the docs, coming from retain()
and searching for "something like retain()
but passing &mut
to its closure, which is why I think it makes sense that the new name of this function has "retain" in the name, so that people find it when searching in the docs).
Some anecdotal data: In my code, I always only needed the retain_mut()
aspect of drain_filter()
(often they were using retain()
before), I never had use cases where I needed to process the drained values. I think this will also the most common use case for others in the future (since retain()
doesn't pass &mut
to its closure so that drain_filter()
has to cover that use case, too, and it's a more common use case than needing to process the drained values).
The reason why I'm agains drain_retain
is because of the way names are currently used in std wrt. collections:
drain
, collect
, fold
, all
, take
, ...*_where
, *_while
map
, filter
, skip
, ...)
map
vs. filter
/skip
)filter_map
apply modifier_1 and then apply modifier_2
, just that it's faster or more flexible doing this in one stepYou sometimes might have:
drain_filter
)
drain_where
)You normally do not have:
take_collect
as it is easily confusingdrain_retain
does kinda make sense but falls in the last category, while you probably can guess what it does it basically says remove and return all elements "somehow specified" and then keep all elements "somehow specified" discarding other elements
.
One the other hand I don't know why there should not be a retain_mut
maybe opening a quick RFC introducing retain_mut
as a efficient way to combine modify
+ retain
I have a hunch it might be faster
stabilized then this function. Until then you could consider writing a extension trait providing
you own retain_mut
using iter_mut
+ a bool-array (or bitarray, or...) to keep track of which elements
have to be reomved. Or providing your own drain_retain
which internally uses drain_filer
/drain_where
but wraps the predicate into a not |ele| !predicate(ele)
.
@dathinab
*_where
?
So we have to compare the naming scheme to methods that already exist on collections, e.g. retain()
, drain()
. There is no confusion with Iterator methods which transform one iterator into another iterator.retain_mut()
would not be added to std because drain_filter()
will already be added and people were advised to use that. Which brings us back to the use case of migrating from retain()
to drain_filter()
being very common, so it should have a similar name and API (closure returning true
means keep the entry)..I haven't been aware that retain_mut
was already discussed.
We are talking about general naming schemes in std mainly wrt. to collections, which does include Iterators as they are one of the main access methods for collections in rust, especially when it's about modifying more than one entry.
_where
is just a example for suffixes to express a slightly modified
function. Suffixes of this kind which currently are used in std are mainly
_until
, _while
, _then
, _else
, _mut
and _back
.The reason why drain_retain
is confusing is that it's not clear if it is
drain or retain based, if it is drain based, returning true would remove
the value, if is retain based it would keep it. Using _where
makes it at
last clear what is expected of the passed in function.
On Mon, Feb 26, 2018, 00:25 Boscop notifications@github.com wrote:
@dathinab https://github.com/dathinab
- We are talking about a method on collections here, not on Iterator. map, filter, filter_map, skip, take_while etc are all methods on Iterator. Btw, which methods do you mean that use *_where? So we have to compare the naming scheme to methods that already exist on collections, e.g. retain(), drain(). There is no confusion with Iterator methods which transform the iterator into another iterator.
- AFAIK the consensus was that retain_mut() would not be added to std because drain_filter() will already be added and people were advised to use that. Which brings us back to the use case of migrating from retain() to drain_filter() being very common, so it should have a similar name and API (closure returning true means keep the entry)..
— You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/rust-lang/rust/issues/43244#issuecomment-368355110, or mute the thread https://github.com/notifications/unsubscribe-auth/AHR0kfkRAZ5OtLFZ-SciAmjHDEXdgp-0ks5tYevegaJpZM4OY1me .
I've been using this function a lot and I always have to remember to return true for the entries I want to drain, which feels counter-intuitive compared to retain()/retain_mut().
FWIW, I think retain
is the counter-intuitive name here. I usually find myself wanting to delete certain elements from a vector, and with retain
I always have to invert that logic.
But retain()
is already in stable, so we have to live with it.. And so my intuition got used to that..
@Boscop: and so is drain
which is the inverse of retain
but also returns the removed elements and the usage of suffixes like _until
,_while
for making functions available which are just a slightly modified version of a existing functionality.
I mean if I would describe drain it would be something like:
remove and return all elements specified "in some way", keep all other elements where "in some way" is "by slicing" for all sliceable collection types and "all" for the rest.
The description for the function discussed here is the same except that "in some way" is "where a given predicate returns true".
One the other hand the description I would give retain is: only retain (i.e. keep) elements where a given predicate returns true, discard the rest
(yes, retain could have been used in a way where it does not discard the rest, sadly it wasn't)
I do think that it would have been really nice if retain
would have
passed &mut T
to the predicate and maybe returned the removed values.
Because I think retain
is a more intuitive name base.
But independent of this I also think that both drain_filter
/drain_retain
are suboptimal
as they do not make it clear if the predicate has to return true/false to keep/drain a entry.
(drain indicates true does remove it as it speaks about removing elements while filter
and retrain speaks about which elements to keep, at last in rust)
In the end it's not that important which of the names is used, it would be just nice if it gets stabilized.
Doing a poll and/or letting someone from the language team decide might be the best way to move thinks forward?
I think something like drain_where
, drain_if
, or drain_when
, is much clearer than drain_filter
.
@tmccombs Out of those 3, I think drain_where
makes the most sense. (Because if
implies either do the whole thing (in this case draining) or not
and when
is temporal.)
Compared to drain_filter
the closure return value is the same with drain_where
(true
to remove an element) but that fact is made clearer/explicit by the name, so it eliminates the risk of accidentally interpreting the meaning of the closure return value wrongly.
I think it’s more than time to stabilize. Summary of this thread:
R: RangeArgument
parameter be added?true
from the callback causes that item to be included in the iterator.)drain_where
.)@Gankro, what do you think?
The libs team discussed this and the consensus was to not stabilize more drain
-like methods at the moment. (The existing drain_filter
method can stay in Nightly as unstable.) https://github.com/rust-lang/rfcs/pull/2369 proposes adding another drain-like iterator that doesn’t do anything when dropped (as opposed to consuming the iterator to its end).
We’d like to see experimentation to attempt to generalize into a smaller API surface various combinations of draining:
RangeArgument
a.k.a. RangeBounds
) v.s. the entire collection (though the latter could be achieved by passing ..
, a value of type RangeFull
).Possibilities might include "overloading" a method by making it generic, or a builder pattern.
One constraint is that the drain
method is stable. It can possibly be generalized, but only in backward-compatible ways.
We’d like to see experimentation to attempt to generalize into a smaller API surface various combinations of draining:
How and where does the team foresee this type of experimentation happening?
How: come up with and propose a concrete API design, possibly with a proof-of-concept implementation (which can be done out of tree through at least Vec::as_mut_ptr
and Vec::set_len
). Where doesn’t matter too much. Could be a new RFC or a thread in https://internals.rust-lang.org/c/libs, and link it from here.
I've been playing around with this for a bit. I'll open a thread on internals in the next days.
I think a general API that works like this makes sense:
v.drain(a..b).where(pred)
So it's a builder-style API: If .where(pred)
is not appended, it will drain the whole range unconditionally.
This covers the capabilities of the current .drain(a..b)
method as well as .drain_filter(pred)
.
If the name drain
can't be used because it's already in use, it should be a similar name like drain_iter
.
The where
method shouldn't be named *_filter
to avoid confusion with filtering the resulting iterator, especially when where
and filter
are used in combination like this:
v.drain(..).where(pred1).filter(pred2)
Here, it will use pred1
to decide what will be drained (and passed on in the iterator) and pred2
is used to filter the resulting iterator.
Any elements that pred1
returns true
for but pred2
returns false
for will still get drained from v
but won't get yielded by this combined iterator.
What do you think about this kind of builder-style API approach?
For a second I forgot that where
can't be used as function name because it's already a keyword :/
And drain
is already stabilized so the name can't be used either..
Then I think the second best overall option is to keep the current drain
and rename drain_filter
to drain_where
, to avoid the confusion with .drain(..).filter()
.
(As jonhoo said above: )
what was the reasoning behind naming this drain_filter rather than drain_where? To me, the former implies that the whole Vec will be drained, but that we also run a filter over the results (when I first saw it, I thought: "how is this not just .drain(..).filter()?"). The former on the other hand indicates that we only drain elements where some condition holds.
I've opened a thread on internals.
The TLDR is that I think that non-selfexhaustion is a bigger can of worms than expected in the general case and that we should stabilize drain_filter
sooner rather than later with a RangeBounds
parameter. Unless someone has a good idea for solving the issues outlined there.
Edit: I've uploaded my experimental code: drain experiments There are also drain and clearing benches and some tests but don't expect clean code.
Totally missed out on this thread. I've had an old impl that I've fixed up a bit and copy pasted to reflect a few of the options described in this thread. The one nice thing about the impl that I think will be non-controversial is that it implements DoubleEndedIterator
. View it here.
@Emerentius but then we should at least rename drain_filter
to drain_where
, to indicate that the closure has to return true
to remove the element!
@Boscop Both imply the same 'polarity' of true
=> yield. I personally don't care whether it's called drain_filter
or drain_where
.
@Popog Can you summarize the differences and pros & cons? Ideally over at the internals thread. I think DoubleEndedIterator
functionality could be added backwards compatibly with zero or low overhead (but I haven't tested that).
How about drain_or_retain
? It's a grammatically meaningful action, and it signals that it does one or the other.
@askeksa But that doesn't make it clear whether returning true
from the closure means "drain" or "retain".
I think with a name like drain_where
, it's very clear that returning true
drains it, and it should be clear to everyone that the elements that aren't drained are retained.
Feature gate:
#![feature(extract_if)]
(previouslydrain_filter
)This is a tracking issue for
Vec::extract_if
andLinkedList::extract_if
, which can be used for random deletes using iterators.Public API
Steps / History
104455
Unresolved Questions
extract_if
accept aRange
argument?Send
+Sync
impls on linked list's ExtractIf, see commentSee https://github.com/rust-lang/rust/issues/43244#issuecomment-641638196 for a more detailed summary of open issues.