Closed Kimundi closed 11 years ago
I think it's silly to have redundant functions/methods doing fail!()
internally every time there's a function/method returning Option
. In many cases it doubles the size of the API, making it harder to maintain and learn.
It would be much simpler to just use the unwrap
method on Option
as necessary.
@thestinger: I agree in principle, but the problem here is that methods on Option
that return an Option
for the element are kinda useless... If we only have unwrap()
, and have to chain that to every other possible call its going to be cumbersome to use.
I'm talking about everything returning an Option
that's not an Option
itself. The methods should only have to be defined once on Option
, we really don't need get
, expect
and so on to go along with methods like find
on every Map
, and have them in every variety find
comes in.
I've been mulling this over. My preference at this point is to do away with the "ref" and "consume" suffixes, and make "get" mean reference, and "unwrap" mean by value. So, today's option get becomes unwrap, unwrap stays the same, get_ref turns into get, map stays as map, map_consume turns into map_unwrap, and consume_iter turns into unwrap_iter.
bblum, unwrap
fails on None, map_unwrap
can't do that, it's not really useful.
My thought about that was "unwrap means might fail", but "map means won't fail" with higher precedence. I dunno, maybe it is better to keep that as consume, but I'd like to see one or the other word go away. Would it be any better to replace all "unwrap" with "consume"?
@bblum: I think we should prefer functions like .map() to consume self if we can, and let the caller use .clone() if they need a copy. That feels more rust-y to me. I've got an old patch series waiting for the recently landed fix for moving out of self that implements this for options, results, and vecs. I'll look into reviving it.
On Tue, Jul 23, 2013 at 8:01 PM, Ben Blum notifications@github.com wrote:
My thought about that was "unwrap means might fail", but "map means won't fail" with higher precedence. I dunno, maybe it is better to keep that as consume, but I'd like to see one or the other word go away. Would it be any better to replace all "unwrap" with "consume"?
— Reply to this email directly or view it on GitHubhttps://github.com/mozilla/rust/issues/7887#issuecomment-21460848 .
@bblum: looking back at my patch, I also added .map_ref() in the case you didn't want to pay the cost to clone the container. Hows does that pattern sound?
That (foo and foo_ref) was my original leaning, but then I thought about .iter()
. The vast majority of the time, (I expect) people will want ref_iter()
instead of consume_iter()
. But maybe the syntax sugar will save us in that case?
The other qualm I had with that idea is that ARC unwrap
should have a longer, more ominous name than get
. But maybe we can keep the name unwrap
for Option and for ARC, as a special case of implicit-consume that additionally means "might fail, be careful".
Keep unwrap for the case where you get "the whole content" out of a thing. I agree .map
, .map_mut
sound good for mapping by reference. What about .map_value
for the consume case? (EDIT: I talk only about Boxes, like Option
Kimundi, I don't really agree with any of your alternatives. Separate Option from the containers. Option is a building block for the latter.
A note to map()
, map_mut()
etc. Could we not just use an external iterator for that? Once it works properly it would end up as box.map()
and box.iter_mut().map()
Not for Option, since it's so central, it needs a lot of sugar
The challenge which approach do we pick for the shortest name? I converted .map()
to consume because I found most use cases could be implemented with .map_consume()
and I figured I'd change the shorter name to be the more efficient one. But since reference iterators are more general, it makes sense to keep .iter()
returning references.
What if we made the container's .map()
consume, and use .iter().map()
if you want the references? With mostly-working default methods, we could lift these implementations into some traits like this:
trait ConsumableIterator<T> {
fn map<U>(&self, f: &fn(T) -> U) -> ConsumableIterator<U>;
}
trait FromIterator<T> {
static fn from_consume_iter(i: ConsumableIterator<T>) -> Self;
}
trait Mappable<T>: ConsumableIterator<T> {
fn map<U, Container: FromIterator<U>>(self, f: &fn(T) -> U) -> Container<U> {
FromIterator::from_consume_iter(self.consume_iter().map(f))
}
impl<T> Mappable<T> for ~[T] {}
I don't think consuming is usually what you want for containers, and lots of them won't be able to do it.
@thestinger: How so? I'm using the term consume in this case to mean transferring the ownership of the contained elements from one container to another. Any container that allows for removing of elements (e.g. not bloom filters) should be able to support this kind of operation.
Consume would often be O(nlogn)
with a tree instead of O(n)
, and not possible with persistent containers.
I'm not sure if you would be able to write consume for a container like a list (chunked or not), without unsafe code that's likely to break if drop flags are removed.
@thestinger: By O(nlogn)
, are you counting both the deconstruction and reconstruction of a container? If just deconstruction, we can implement tree consumers in Rust in O(n)
if take advantage of unique pointers. It allows us to violate the tree algorithm invariants to unfold the tree structure. Here's a simple O(2n)
version to consume the tree:
enum Tree<T> {
Node(T, ~Tree<T>, ~Tree<T>),
Empty
}
impl<T> Tree<T> {
fn consume_iter(self) -> ConsumeIterator<T> {
fn aux(node: Tree<T>, mut values: ~[T]) {
match node {
Empty => values,
Node(value, left, right) => {
values = aux(left);
values.push(value);
values = aux(right);
values
}
}
}
aux(self, ~[]).consume_iter()
}
}
It wouldn't be that hard to optimize away values
, but we'd still need a stack to track where we are in the tree.
Regarding persistent containers, I totally agree, and it makes total sense for their .map()
to return references.
Regarding writing consume for a mutable singly linked list, you would need an intermediate structure that stores the consumed items in reverse order. All you need to do then is unreverse them. It'd also be O(2n)
. We could use dark magic to reverse the pointer directions to get it down to O(n)
but I'm not sure if it'd be worth the unsafety.
@erickt: I was thinking about the complexity of consuming and building up a new tree compared to just doing the operation in-place. Default methods like fold
would likely be based on iter
and I think by-reference is usually going to be desired.
I added a consume iterator to treemap recently.
(no unsafe, o(n), very tidy!)
More to the point of the bug, I think i prefer no suffix means by-ref, we use the suffix _move rather than take, consume or unwrap, and partial functions are funneled into either conditions or options with an _opt suffix.
@graydon: So .map_move()
? I can live with that.
Speaking towards the bug, do we need partial forms of functions like .take()
? We're starting to hit a combinatorial explosion of different function variants: by-ref / by-move / mut / default / map / expect. Instead I'd prefer if we just have up to 3 forms when necessary: by-move, by-immutable-ref, and by-mutable-ref. If we want a partial result, we can call Some(~"foo").map(|x| x.to_str()).expect(~"I expected some foo")
. It does result in an extra Option
allocation, but that should be both relatively cheap, and easy to optimize away with enough #[inline]
attributes.
Isn't take_unwrap()
used quite a lot in the codebase? Having an explicit take()
would allow to construct that call as take().unwrap()
.
One observation: Persistent collections will not be able to support a "consume" operation.
@nikomatsakis: Yeah, I figured you'd need a separate trait for each of these functions.
Does anybody have a succinct explanation of the current consensus on this subject for https://github.com/mozilla/rust/wiki/Doc-detailed-release-notes?
If we do #9355 (which I'd be totally in favor for), it would pretty much reduce all discussion here to get
vs unwrap
.
I'm kinda embarrassed that I never got the idea that you might not need three different versions of accessors for Option
... ;)
Many of the issues have been dealt with by a3b04c1 and the work towards consistency in the codebase. I've opened #9784 about get
vs. unwrap
and I think we can tackle any other issues in independent bugs.
Right now there is much confusion and inconsistency about the naming of functions that deal with elements in types that wrap one or more of them. We need to find a consistent way to deal with that.
First, some terminology:
Examples: Vector, Hash Map, Linked List.
Examples: Option, Cell, ~T, ARC.
I'm also defining a elementary set of operations you can perform on containers and boxes in regard to an arbitrary chosen element
E
(named in all-caps to differentiate from actual function names below):(&'a self) -> &'a E
(&'a mut self) -> &'a mut E
(&mut self) -> E
(self) -> E
In regard to naming schemes, I see a number of possibilities:
Unify boxes and containers
This would be the globally most consistent and reusable way, but also the most ugly one. We define a set of functions for working with an arbitrary 'next' element on an container. For example in stack-terminology:
fn top(&'a self) -> &'a E
- REF, fail if emptyfn peek(&'a self) -> Option<&'a E>
- REF or Nonefn top_mut(&'a mut self) -> &'a mut E
- REF_MUT, fail if emptyfn peek_mut(&'a mut self) -> Option<&'a mut E>
- REF_MUT or Nonefn pop(&mut self) -> E
- TAKE, fail if emptyfn pop_opt(&mut self) -> Option<E>
- TAKE or Nonefn pop_consume(self) -> E
- UNWRAP, fail if emptyfn pop_consume_opt(self) -> Option<E>
- UNWRAP or None.Separate boxes and containers, consistent box naming scheme
Containers and Boxes are separate concepts. Containers are build around a terminology and operations that make most sense to them. They probably wouldn't support UNWRAP and might not support the others either, depending on their specific semantics.
Boxes gain their own more concise and strictly consistent terminology, based around the box metaphor:
fn get(&'a self) -> &'a E
- REF, fail if emptyfn get_opt(&'a self) -> Option<&'a E>
- REF or Nonefn get_mut(&'a mut self) -> &'a mut E
- REF_MUT, fail if emptyfn get_mut_opt(&'a mut self) -> Option<&'a mut E>
- REF_MUT or Nonefn take(&mut self) -> E
- TAKE, fail if emptyfn take_opt(&mut self) -> Option<E>
- TAKE or Nonefn unwrap(self) -> E
- UNWRAP, fail if emptyfn unwrap_opt(self) -> Option<E>
- UNWRAP or None.Separate boxes and containers, convenient box naming scheme
This one is quite similar to the prior one, but shuffles names and definitions around to provide a nicer set of function names for the common operations:
fn get(&'a self) -> &'a E
- REF, fail if emptyfn get_opt(&'a self) -> Option<&'a E>
- REF or Nonefn get_mut(&'a mut self) -> &'a mut E
- REF_MUT, fail if emptyfn get_mut_opt(&'a mut self) -> Option<&'a mut E>
- REF_MUT or Nonefn take(&mut self) -> Option<E>
- TAKE or Nonefn unwrap(self) -> E
- UNWRAP, fail if emptyThe last one is probably the most convenient as it's optimized for common "I know it is not
None
"Option
usecases:opt.get()
- " Do something with the content of a Option, but keep it in place."opt.unwrap()
- "Move the content out of a Option, throw the Option away"opt.clone().unwrap()
oropt.get().clone()
- "Get a copy of the content"opt.take().unwrap()
- "Move the content out of a Option, replace the Option with None"Hm, this writeup is only half as comprehensive and cohesive as I wanted it to be, I hope it's still understandable.