swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0
67.25k stars 10.32k forks source link

[SR-9870] Dictionary needs an API suitable for memoization #52276

Open dabrahams opened 5 years ago

dabrahams commented 5 years ago
Previous ID SR-9870
Radar None
Original Reporter @dabrahams
Type Improvement

Attachment: Download

Additional Detail from JIRA | | | |------------------|-----------------| |Votes | 0 | |Component/s | Standard Library | |Labels | Improvement | |Assignee | @lorentey | |Priority | Medium | md5: d2bb9e67d575089c4ebdf00102db57df

Issue Description:

We need an API like this:

extension Dictionary {
    /// Returns `self[key] ?? computeDefault()`, inserting the result of
    /// `computeDefault` into the dictionary if it is called.
    subscript(
        key: Key, setDefault computeDefault: @autoclosure ()->Value
    ) -> Value {
        @inline(__always)
        mutating get {
            if let r = self[key] { return r }
            let r = computeDefault()
            self[key] = r
            return r
        }
    }
}

There's no way to code it outside the standard library that avoids doing a 2nd dictionary lookup.

belkadan commented 5 years ago

This works but doesn't really invalidate your request.

private func save<T>(_ x: inout T) -> T { return x}

extension Dictionary {
    subscript(
        key: Key, setDefault computeDefault: @autoclosure ()->Value
    ) -> Value {
        @inline(__always)
        mutating get {
            return save(&self[key, default: computeDefault()])
        }
    }
}

var x: [String: Int] = [:]
print(x["abc", setDefault: 1])
print(x["abc", setDefault: 2])
print(x)
belkadan commented 5 years ago

cc @lorentey

dabrahams commented 5 years ago

@belkadan That is truly evil. Have to say I'm impressed!

dabrahams commented 5 years ago

Heh, but at least on cursory examination it generates even worse code than the other two ways I tried and doesn't avoid rehashing 🙁

my versions: x0.S x1.S

Jordan's evil version: x2.S

lorentey commented 5 years ago

Yeah, this shouldn't need such contortions. A direct implementation would generate better code.

The good news is that Jordan's evil version only hashes twice on the slow path, when the insertion needs to resize the table. (In which case rehashing is necessary, as the dictionary switches to a new hash seed.)

Given that we already have a defaulting subscript with slightly different semantics, memoization would probably work better as a mutating method than a new subscript variant.

extension Dictionary {
  mutating func memoizeValue(forKey key: Key, _ computeValue: () -> Value) -> Value { ... }
}

var identifiers: [String: Int] = [:]
var nextId = 0
for string in ["Foo", "Bar", "Foo", "Foo", "Bar"] {
  let id = identifiers.memoizeValue(forKey: string) \{ let id = nextId; nextId += 1; return id }
  print(id)
}
// ⟹ 0 1 0 0 1

(I am not attached to this particular method name or signature, though.)

dabrahams commented 3 years ago

I just realized why I never thought of @belkadan's evil technique: the documentation for that subscript is wrong.

@lorentey: I have a bunch of other twisty non-memoization examples where I am currently forced to (ab)use that technique. I'll try to post links here when they are ready.

dabrahams commented 3 years ago

@lorentey here are the examples; look for “default:” in the code to find the places where the hack is used. I don't know if an API whose name involves “memoize” would really be appropriate for these uses.