rust-itertools / itertools

Extra iterator adaptors, iterator methods, free functions, and macros.
https://docs.rs/itertools/
Apache License 2.0
2.72k stars 309 forks source link

GroupMap: add fold_with #778

Closed tamird closed 12 months ago

tamird commented 1 year ago

This is a generalization of fold which takes a function rather than a value, which removes the need for a Clone bound. fold is implemented in terms of fold_with.

Philippe-Cholet commented 1 year ago

I'm not in favor of a breaking change without good reason. Plus, add fold_with seems way more reasonable. I would like another opinion on this before you work on this though.

phimuemue commented 1 year ago

If we accept this, I'd second @Philippe-Cholet's proposal and implement fold in terms of fold_with.

tamird commented 1 year ago

Done.

Philippe-Cholet commented 1 year ago

I think fold_with example should be updated to something that fold can not do by itself (not cloning copying some value). Otherwise, there would be no need for such generalization.

I'm wondering if FI should be with FnMut as well?!

tamird commented 1 year ago

Done. The example is a bit contrived, let me know if you have a specific suggestion please.

Philippe-Cholet commented 1 year ago

I did not thought of anything and Default::default in particular but I see it could be useful. However, the Accumulator struct is artificial and I came up with another idea, short and that could be useful: || Vec::with_capacity(64). The resulting vector would be clonable but the capacity of each clone would be 0 so using this closure with fold_with does something that fold(Vec::with_capacity(64), ...) would not be able.

tamird commented 1 year ago

That doesn't seem very compelling; such code would still compile albiet with slightly worse performance. I would prefer to demonstrate a use case that can't be written using fold at all.

Philippe-Cholet commented 1 year ago

For such small example, it sure is not very compelling. Looking at https://doc.rust-lang.org/std/clone/trait.Clone.html#implementors , it's not easy to find a non-clonable std type, leading to create our own type for this. I guess fold_with would be used on user-defined structs anyway. Let's go with your Accumulator. We should have one quicktest about this and it would be ready.

tamird commented 1 year ago

Done.

Philippe-Cholet commented 1 year ago

@phimuemue What do you think of this fold_with? The quickcheck test is an adaptation of the existing correct_grouping_map_by_fold_modulo_key.

phimuemue commented 1 year ago

As for the test, I'm fine with the copy-paste (it's a specific test meant to test one thing only, and the danger that these two tests diverge seems low).

I first thought that I'm fine with it API wise (it's in line with other _with-functions). But one question suggested itself: Should we - in the spirit that functions are either simple to use (such as the existing fold) or as general as possible (possibly the new fold_with), but not something in between - pass key and val to init?

Philippe-Cholet commented 1 year ago

I did not thought of that, it would definitely be more general. I'm wondering what would be the use case though, especially having &val.

But with the simple fold remaining as is, I guess we should be as general as possible with fold_with.

I would keep the doctest example simple though with |_key, _val| Default::default(). It suggests Default use instead of Clone while keeping "key/val" usages clear enough.

tamird commented 1 year ago

I did write this with a key parameter at first, but then realized the result is almost the same as aggregate, save for one Option.

I ended up using aggregate in my own code FWIW. Let me know what you prefer.

tamird commented 1 year ago

@Philippe-Cholet @phimuemue did you want this to take &K or is it suitable as is?

phimuemue commented 1 year ago

@Philippe-Cholet @phimuemue did you want this to take &K or is it suitable as is?

Since you already cooked up a version involving &K it seems that (at least) this parameter is needed. Let's go with the most general one, i.e. FnMut(&K, &V) - it's more general than accepting only &K while not being (much) more expensive.

I agree the function is very similar to aggregate, but it has its use: It lets you operate outside Option-land.

Philippe-Cholet commented 12 months ago

Sorry for the late response, I took a little break.

I did write this with a key parameter at first, but then realized the result is almost the same as aggregate, save for one Option.

I ended up using aggregate in my own code FWIW. Let me know what you prefer.

Maybe it's because I'm not very familiar with GroupMap but I don't see what you're saying. But apparently phimuemue does so not a real problem.

Otherwise agree with phimuemue.

tamird commented 12 months ago

Done.

phimuemue commented 12 months ago

Thank you. I'll merge this and make the closure accept &V, too.