Open camelid opened 3 years ago
intersperse_with
to this feature.I would like to suggest a simpler, alternative name for this method - weave
. At the time of comment, the word weave
is not yet used in Rust std.
Intersperse
is quite verbose (especially when compared to Vec::join
), but more importantly rarely used in common English. weave
is much more common and concise in comparison.
Scala and haskell both use intersperse
. Scalaz calls it intercalate
.
I would like to suggest a simpler, alternative name for this method -
weave
.
weave
is not on the list of terms I would search for when looking for the expected behavior. I'd even consider it vanity.
Intersperse
is quite verbose (especially when compared toVec::join
), but more importantly rarely used in common English.weave
is much more common and concise in comparison.
A verbose function name describes its effect better. More so if the term is uncommon; there is then less ambiguity. intersperse
does exactly what is says: intersperse elements with a common separator.
why not use join
and join_with
for iterators too?
Scala and haskell both use
intersperse
. Scalaz calls itintercalate
...
intercalate
is slightly different.
intersperse :: a -> [a] -> [a]
intercalate :: [a] -> [[a]] -> [a]
why not use
join
andjoin_with
for iterators too?
I feel like join
implies the creation of something monolithic from a sequence and a separator, i.e
let items = vec!["foo", "bar", "blah"];
let result = items.join(" ");
will create the string foo bar blah
, and I don't have any issue with that.
But in my mind intersperse
better communicates the idea of placing separators in between a sequence, without creating something monolithic using the results of that.
I suppose I mean that if I intersperse an iterator of &str
s with " "
, it makes sense to me that I get an iterator back.
But if I have an iterator of &strs
, and the method is named join
, I feel like it's a little less clear. Without looking further into it, it might seem like that I'd get a concatenated String
back, or similar.
For other types it may be less unclear but I'm still unsure on whether calling the method join
is a good fit.
I feel like
join
implies the creation of something monolithic from a sequence and a separator
That's true, but we are using vec::join
or slice::join
and that respects the pattern matching and it is not monolithic. Then vec::join
should be changed too. If Rust already acknowledges that join be non monolithic, then I suggest we stick with the same ideology.
But if I have an iterator of
&strs
, and the method is namedjoin
, I feel like it's a little less clear. Without looking further into it, it might seem like that I'd get a concatenatedString
back, or similar.
I'm always collecting to String
after intersperse
. 😅
My personal preference aside, vec::join
creates a vec
if the initial data type is not &str
. So it is just intersperse
which happens to create a String
in a specific scenario.
Personally, if it were not as verbose I would have gone for intersperse
. If join
was not used in vec
or slice
, I would not root for it. But weave
doesn't sound good IMO.
My main point is that let's not have two different terms do the same thing based on datatype.
Personally, if it were not as verbose I would have gone for intersperse. If join was not used in vec or slice, I would not root for it. But weave doesn't sound good IMO.
I did join
for Iterator
in the past https://github.com/rust-lang/rust/pull/75738 which accepts the current Join
that does both but it wasn't accepted because it requires allocation or Iterator
is in std
so it couldn't be implemented, splitting it as two functions works but may not be that discoverable.
In a sample code in helix,
let res = sel
.ranges
.into_iter()
.map(|range| format!("{}/{}", range.anchor, range.head))
.collect::<Vec<String>>()
.join(",");
How to join String
or &String
easily with &str
and such? I think cases more than that we need some intervention which the API will not be ergonomic. I think maybe it only works well with &'static str
.
In the current example, I believe it's too basic and limited, requires &'static str
and copied
.
let hello = ["Hello", "World", "!"].iter().copied().intersperse(" ").collect::<String>();
I think we need to think more about the ergonomics of the API before stabilization.
@rfcbot merge
Team member @m-ou-se has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
:bell: This is now entering its final comment period, as per the review above. :bell:
could call it "sep" for separator?
The final comment period, with a disposition to merge, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.
The RFC will be merged soon.
Re-opening this, as the FusedIterator
question is still open.
Is there any reason why we wouldn't want impl<I: FusedIterator> FusedIterator for Intersperse<I>
? If no one has anything, I may just open a PR for it. It would be pretty trivial.
I just saw that the stabilization was reverted, per #88967 and #89638.
By the way, could someone update the top post to reflect the stabilization revert?
@inquisitivecrystal I have now added a note about the revert.
I think (just guessing!) it would be realistic to use a build script in itertools, coordinate with stabilizations, and cfg-out methods that collide depending on the compiling rustc version. I mostly wonder if that's even a good enough solution - are there too many things breaking that use a lockfile or won't otherwise pick up a newer version of itertools anyway?
I mostly wonder if that's even a good enough solution - are there too many things breaking that use a lockfile or won't otherwise pick up a newer version of itertools anyway?
Only about 38% of crates.io downloads for itertools are for the latest release; the rest are for older versions. Given this, the added maintenance burden of checking the version of rustc might be mostly wasted effort. I also worry about the possibility that there might be subtle breaking changes between how an adapter works in Itertools vs how it works in the version ultimately stabilized in libcore. We'd have to keep a very close eye out for such differences to uphold semver. I'm not confident we'd get it totally right.
An alternative proposed on Zulip: Itertools could prefix every method with it_
. I dislike the annoyance-of-upgrading this poses and its aesthetics, but it would mitigate this issue permanently.
The breakage caused by Iterator::intersperse
spawned a lot of discussion about language-level solutions to this problem, some of which is relevant to other widely-used crates employing the extension trait pattern (e.g., moving methods from FutureExt to Future). I'd like to avoid making a change like prefixing every Itertools method with it_
if this is just going to be mostly fixed at a language-level anyways.
Or just use a different name like sperse
, intersparse
, or something.
Or just use a different name like
sperse
,intersparse
, or something.
I really cringe at the thought of playing any part in Rust having a #smooshgate every time someone wants to upstream an itertools adapter into libcore.
Another option is to integrate itertools in its entirety. Whether someone uses itertools or not becomes irrelevant.
Or just use a different name like
sperse
,intersparse
, or something.
That isn't viable whatsoever, as sperse simply means to disperse (and is a highly uncommon archaic form to the extent where I wasn't even aware it was a real thing) and intersparse is not standard English.
Another option is to integrate itertools in its entirety. Whether someone uses itertools or not becomes irrelevant.
I'm more in favour of this, depending on the popularity of specific adaptors. In any case, we need a concrete solution for stabilizing things without needing to worry about external crates. My only hope is that Rust's ecosystem doesn't become anything like JavaScript's, where nothing gets done because anything useful is considered breaking.
@Nytelife26 check the RFC where I mentioned this issue. I have detailed a possible approach. You can contribute there. It has to be done in a systematic manner aka the Rust way. One permanent solution for all similar problems.
As a newcomer to this, I'm still pretty confused: Why do we not offer a join function on iterators
i.join(",")
which is equivalent to
i.intersperse(",").collect()
?
This would be a huge ergonomic improvement. The fact that .join()
is only a function on slices, and not iterators, is non-idiomatic as the logic of the join function works on an iterator and doesn't require it to be a slice.
https://github.com/rust-lang/rust/pull/75738 I have created a pull request for that in the past. That is due to join
requires allocation, but currently the iterator trait is on core
, so we can't add it alongside the other iterators.
I would rather say, there is no need to have a shortcut, when iterator trait have already a LOT of method. Learning to combine iterator method is the way to go. i.intersperse(",").collect()
is not longer than any other code I know that collect the iterator. We can't add hundred of method on Iterator trait for every use case.
This would be a huge ergonomic improvement. The fact that .join() is only a function on slices, and not iterators, is non-idiomatic as the logic of the join function works on an iterator and doesn't require it to be a slice.
That a bold statement, I don't see how that not idiomatic to NOT have a join method on iterator. Use the very used and versatile collect IS the idiomatic way IMO.
My opinion on the breaking thing is that AFAIK, it was accepted and considered a minor change to add item, this is a 1 years unresolved question that didn't move, I don't think there is any solution, either we accept it or we give up. I don't see why we wait. Seem we wait for https://github.com/rust-lang/rust/issues/89151
@pickfire thanks for the answer, but .collect()
also allocates, right?
collect()
calls a FromIterator
impl, which is provided by the collection being collected into. So there is no allocation code in the definition of the Iterator
trait
Right, so how is join different? Here's the definition for collect:
fn collect<B: FromIterator<Self::Item>>(self) -> B
where
Self: Sized,
{
FromIterator::from_iter(self)
}
And here's join:
fn join<B: FromIterator<Self::Item>>(self, separator: Self::Item) -> B
where
Self: Sized,
Self::Item: Clone,
{
FromIterator::from_iter(self.intersperse(separator))
}
This requires stable intersperse
, but even without it, join
should require no allocation within the Iterator trait itself.
Inclusions to the standard library must meet very high bars for ergonomics and performance. Meeting those standards for Iterator::join
is going to be tricky.
In many situations, the implementation you propose above is going to be less efficient than writing:
iter.collect::<Vec<_>>.join(sep)
...since that usually does not require cloning the separator. This alternative is also more flexible, as it does not require that the separator type and the item type are the same.
If it were as simple adding that alternative to Iterator
:
fn join<S>(self, sep: S) -> <[T] as Join<Separator>>::Output
where
Self: Sized,
[Self::Item]: Join<S>,
{
iter.collect::<Vec<_>>.join(sep)
}
...we might do so! Unfortunately, Iterator
is defined in core
, and Join
is defined in std
. We can't simply move Iterator
to std
(since that would break no_std
applications) and we can't simply move Join
to core
(due to coherence issues).
I see what you're saying, but the most common case is Separator = &str
, which only clones the &str pointer (not the string itself). I'm not aware of the use cases where iter.collect::<Vec<_>>.join(sep)
is more efficient -- on the contrary, I think it's most common that this is less efficient, due to allocation, and users are doing it anyway as it's the easiest way to join a &str
in the standard library without using itertools. Would love to see some data though. Thanks for the answer.
I keep running into places where I would like intersperse
rather than .collect::<Vec<_>>().join()
. I never run into reasons to use other itertools helpers, by in large.
Which is to say I hope this can be stabilized soon. (I think intersperse
is a logically correct thing to call this, since that's exactly what it does.)
Is my understanding correct that this is waiting on https://github.com/rust-lang/rust/issues/89151 which was an accepted RFC but never had implementation work yet?
I expected this code to compile
let intersperse = s.chars()
.enumerate()
.map(|(index, value)| {
let lowercase = iter::once(value.to_ascii_uppercase());
let uppercase = iter::repeat(value.to_ascii_lowercase()).take(index);
lowercase.chain(uppercase)
}).intersperse(iter::once('-'))
Instead, I need to create such boilerplate because I need to have exactly the same input type
let IteratorOfCharIterators= s.chars()
.enumerate()
.map(|(index, value)| {
let lowercase = iter::once(value.to_ascii_uppercase());
let uppercase = iter::repeat(value.to_ascii_lowercase()).take(index);
lowercase.chain(uppercase)
}).intersperse(iter::once('-').chain(iter::repeat('0').take(0)));
Or what should I do in similar situations?
What is the status of this feature?
@c410-f3r as I mentioned above it seems to be blocked on requiring https://github.com/rust-lang/rust/issues/89151
Thank you @Fishrock123
The plan we arrived upon is to first revert the stabilization and then plan on stabilizing the feature in a future release, @Mark-Simulacrum suggested 1.65.
Leaving aside the above initial plan described in https://github.com/rust-lang/rust/issues/88967#issuecomment-938024847 and taking into consideration that https://github.com/rust-lang/rust/issues/89151 appears to be stalled, this feature will likely take some years to be stabilized, if ever.
Yes, it looks like someone interested would need to take up working on https://github.com/rust-lang/rust/issues/89151
This has been open for years. The nice solution of changing trait resolution had no progress for years either, and may be blocked by other refactorings too.
Could we reconsider renaming the method?
Rust has the doc(alias)
feature, so it can still be findable by this term.
"intersperse" seems to be an uncommon word. English is not my native language, and I don't recall ever hearing that word in a conversation. It's not a word I'd use to find this function, and the name to me doesn't explain what the function does, so personally I don't care for that term.
How about separate_by
or insert_between
? I think these are simpler names that accurately explain this function.
Or lace
, interlace
, weave
, interweave
, intertwine
.
My personal bikeshed-report: I'd suggest interweave
. Unlike most alternatives, the verb addresses an action done to the iterator (as opposed to an action done to the elements, e.g. seperated_by
), and doesn't sound unnecessarily mathematical (e.g. interpolate
) or skeuomorphic (e.g. sprinkle
, interlace
, amalgamate
, commingle
)
I think separate(d)_by
is a really good name. Unlike intersperse
, interlace
and interweave
it's actually obvious that it takes a separator, not a second iterator like Itertools::interleave
.
We could also consider calling it join
, like the similar method Vec::join
. It's straightforward and matches names in other languages (like Python). The only downside could be confusion with Vec::join
, but there are already a couple of existing methods that have the same name between Iterator
and Vec
(e.g., first
and last
).
We could also consider calling it
join
, like the similar methodVec::join
. It's straightforward and matches names in other languages (like Python). The only downside could be confusion withVec::join
, but there are already a couple of existing methods that have the same name betweenIterator
andVec
(e.g.,first
andlast
).
.join()
is a misnomer on Vec
imho, as the term alludes to an action done to the elements ("the elements are joined using the argument"), as opposed to the Vec
("the Vec
is joined with a second Vec
; which join()
doesn't actually do).
If someone can make a decision, I'm willing to change the Stone Of Shame around my neck for the Stone Of Triumph and submit a PR.
I agree that Vec::join
has a confusing name... however, I'm unclear if that means you agree that Iterator::join
is a good name. It's obviously too late to change Vec::join
, so if Iterator::join
would match it and be no worse a name, it seems like a potentially good option. Do you mind clarifying?
It can't be join
, because that's already taken by a slightly different operation that also collects. Vec::join("")
returns a single joined String
. So does Itertools::join
.
Both join functions call their argument a separator. Therefore I think Iter::separate
(_by
) makes the most sense for the iterator operation that happens under the hood of join.
I see, that makes sense. It does seem like Iter::separate
or separate_by
are best then. Perhaps we do intersperse -> separate
and intersperse_with -> separate_with
?
My personal favorite would be separate_with
and nothing else. The version accepting a closure is more powerful and those couple characters won't hurt anybody. I look at it like this:
.separate_with(|| ",")
.separate(",")
The only purpose of having a separate
version seems to me to save a couple (8) characters. Not worth it to clutter the standard library documentation in my opinion. But it doesn't matter that much either. Having both is fine too.
@camelid I was arguing for not using .join()
, my bad.
fn separate() -> Separate
and fn separate_with() -> SeparateWith
it is, then?
I think fn insert_between
works best, particularly if we only have the closure form, but I don't mind separate_with
. I agree that we only really need one form.
This is a tracking issue for the
Iterator::intersperse
andintersperse_with
methods. It is based onitertools::intersperse
and is the iterator equivalent ofVec::join
.The feature gate for the issue is
#![feature(iter_intersperse)]
.Public API
Steps / History
intersperse
: #79479intersperse_with
: https://github.com/rust-lang/rust/pull/80567impl<I: FusedIterator> FusedIterator for Intersperse<I>
? (Can be added after stabilization, so not urgent.)Unresolved Questions
None.