apple / swift-algorithms

Commonly used sequence and collection algorithms for Swift
Apache License 2.0
5.9k stars 435 forks source link

Add `grouped(by:)` and `keyed(by:)` #197

Closed amomchilov closed 9 months ago

amomchilov commented 1 year ago

Description

Pitch: https://forums.swift.org/t/pitch-sequence-grouped-by-and-keyed-by/65511 Swift Evolution proposal: https://github.com/apple/swift-evolution/pull/2078/files?short_path=2767014

Now that this is shipped, here's a GitHub search that shows usages in public repos.

Detailed Design

Include any additional information about the design here. At minimum, show any new API:

extension Sequence {
  func grouped<GroupKey>(
    by keyForValue: (Element) throws -> GroupKey
  ) rethrows -> [GroupKey: [Element]]

  func keyed<Key>(
    by keyForValue: (Element) throws -> Key,
    uniquingKeysWith combine: ((Key, Element, Element) throws -> Element)? = nil
  ) rethrows -> [Key: Element]
}

Documentation Plan

I've updated new Grouping.md and Keyed.md guides, and updated README to point to thme.

Test Plan

I wrote tests loosely based on the Standard Library's tests for Dictionary.init(grouping:by:). The Standard Library has a magical expectCrashLater() which lets you test fatalErrors and such. Without it, I'm not sure how to test properly test the fatalError() this PR introduces to keyed(by:).

Source Impact

Purely additive.

Checklist

amomchilov commented 1 year ago

do you think it would make sense to include a method like this (naming tbd) that incorporates/enables them both?

For the sake of flexibility, I would like to eventually also have this API, but even if it exists, I think specialized version for these two use cases (and tally) is worthwhile, to simplify the common case.

If you have any ideas in this regard, could you chime them in on the pitch thread?

I don't know that a crash is the right default behavior for keyed(by:) — for the dictionary initializer we require the unique parameter name so that an author is essentially asserting the right behavior. Do you think crashing there is okay without any other warning, or would a "last value wins" approach be acceptable? (That would match the behavior of most other implementations and replicate the trivial code that this method replaces.)

That's a very good point. Perhaps "last value wins" reduces crashes, but might let incorrectness slide by... IDK what's preferable. 🤷

amomchilov commented 1 year ago

Thanks for the feedback everyone! I'll try to incorporate it all this weekend

amomchilov commented 9 months ago

@natecook1000 That sounds good by me! I'll try to rework this on the weekend :)

amomchilov commented 9 months ago

Hey @natecook1000 @stephentyrone,

I rebased, and my latest commit changes from the original "throw on collision" behaviour to the new "keep latest" behaviour.

natecook1000 commented 9 months ago

@swift-ci Please test

stephentyrone commented 9 months ago

Thanks for your persistence on this one, @amomchilov!