rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.93k stars 12.68k forks source link

HashMap struct key with `'static` lifetime screws with return lifetime using methods like `.get()` #124614

Open jsdw opened 5 months ago

jsdw commented 5 months ago

I'm not sure what words to use to actually describe (or search for a dupe) this issue, so I'll just post it here and hope that it makes more sense to you guys (and sorry if it is a dupe; I'm almost sure it would be)!

So basically, if I write this code:

use std::collections::HashMap;

#[derive(PartialEq,Eq,Hash)]
struct Key<'a> {
    key: &'a str
}

struct Foo {
    types: HashMap<Key<'static>, String>,
}

impl Foo {
    fn resolve<'this, 'a>(
        &'this self,
        key: &'a str,
    ) -> Option<&'this String> {
        self.types.get(&Key { key })
    }
}

I get back this error:

error: lifetime may not live long enough
  --> src/lib.rs:17:9
   |
13 |     fn resolve<'this, 'a>(
   |                -----  -- lifetime `'a` defined here
   |                |
   |                lifetime `'this` defined here
...
17 |         self.types.get(&Key { key })
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ method was supposed to return data with lifetime `'this` but it is returning data with lifetime `'a`
   |
   = help: consider adding the following bound: `'a: 'this`

Despite the fact that HashMap::get() returns the values with a lifetime of self, and nothing to do with the key provided.

However, this does work:

use std::collections::HashMap;

#[derive(PartialEq,Eq,Hash)]
struct Key<'a> {
    key: &'a str
}

struct Foo {
    types: HashMap<&'static str, String>,
}

impl Foo {
    fn resolve<'this, 'a>(
        &'this self,
        key: &'a str,
    ) -> Option<&'this String> {
        self.types.get(key)
    }
}

Removing the struct indirection when defining the key type seems to make things happy!

Background

The above is a minimal reproduction of the issue I ran into, but the reason I ran into it is because I want to have something like:

use std::collections::HashMap;

struct Foo {
    types: HashMap<(String,String), String>,
}

impl Foo {
    fn resolve<'this, 'a>(
        &'this self,
        key1: &'a str,
        key2: &'a str,
    ) -> Option<&'this String> {
        self.types.get(&(key1, key2))
    }
}

In other words, I want to be able to borrow more than 1 value to lookup some owned key in the hashmap. I went down the route of trying struct Key<'a> { a: Cow<'a,str>, b: Cow<'a,str> and then HashMap<Key<'static>, String>, but this led me to bumping up against this odd lifetime issue :)

Any pointers would be amazingly welcome!

pacak commented 5 months ago

I simplified the example a bit, nethier Foo nor Key are needed.

use std::collections::HashMap;

fn resolve<'this, 'a, T>(types: &'this HashMap<&'static str, T>, key: &'a str) -> Option<&'this T> {
    types.get(&key)
}
HeroicKatora commented 5 months ago

Duplicate of: https://github.com/rust-lang/rust/issues/80389#issuecomment-752067798 however the practical example here is amenable to a concrete possible fix, hence I'm expanding on the explanation of how to arrive there.

get(&Q) relies on the bound Borrow<Q> for K. You can use a reference to any query type Q provided that the stored keys can be borrowed as that query type. In Rust 'borrowed as' is integrated with the type system, using the Borrow trait, to say: these types have the same equality, hashing, comparison.

The standard library provides:

Fixing this is possible by ensuring the Borrow exists for different lifetimes, the way an indirect reference can. One layer of struct wrapper is required. It's not possible to write Borrow<Key<'b>> for Key<'a> directly since this overlaps with the reflexive impl given in the standard library.

#[derive(Hash, PartialEq, Eq)]
struct Key<'lt> {
    inner: &'lt str,
}

/// Wraps key so it can be `Borrow`'d under any short lifetime.
///
/// Avoid the conflicting implementation problem of using the same type.
#[derive(Hash, PartialEq, Eq)]
#[repr(transparent)]
struct KeyCapsule<'lt> {
    inner: Key<'lt>,
}

// Hash, PartialEq, Eq, are compatible due to the way single-attribute structs are derived.
// It just calls the methods on each field.
impl<'a, 'b: 'a> core::borrow::Borrow<Key<'a>> for KeyCapsule<'b> {
    fn borrow(&self) -> &Key<'a> {
        &self.inner
    }
}

fn with<'this, 'a>(map: &std::collections::HashSet<KeyCapsule<'this>>, key: &'a str) -> bool {
    map.contains(&Key {
        inner: key,
    })
}

fn main() {
    let mut map: std::collections::HashSet::<KeyCapsule<'static>>
        = Default::default();

    map.insert(KeyCapsule { inner: Key { inner: "Hello" }});
    assert!(with(&map, "Hello"));
}
jsdw commented 5 months ago

Thank you for the very insightful response @HeroicKatora!

As I switched to using hashbrown::HashMap for no-std support, I ended up finding the raw_entry API really useful for working around the issue and did something like this, which I'll leave here just in case it helps anybody:

// Define a key type to insert into a hashbrown::HashMap
#[derive(Clone, Debug, PartialEq, Eq)]
struct Key {
    a: String,
    b: String,
}

impl core::hash::Hash for Key {
    fn hash<H: core::hash::Hasher>(&self, state: &mut H) {
        hash_key(self.a, &self.b, state);
    }
}

fn hash_key<H: core::hash::Hasher>(a: &str, b: &str, state: &mut H) {
    a.hash(state);
    b.hash(state);
}

// And then, we can lookup using the raw_entry API like:
fn lookup<'map>(
    a: &str,
    b: &str,
    map: &'map HashMap<Key, Info>,
) -> Option<(&'map Key, &'map Info)> {
    // Manually construct the key hash
    let hash = {
        use core::hash::{BuildHasher, Hasher};
        let mut state = map.hasher().build_hasher();
        hash_key(a, b, &mut state);
        state.finish()
    };

    // Look up the entry by hash, and then compare keys we get back to find the matching key.
    map.raw_entry().from_hash(hash, |k| k.a == a && k.b == b)
}

Though at some point I might go back to using struct Key { a: Cow<'static,str>, b: Cow<'static,str> } plus your wrapper approach instead, which would avoid any hashing shenanigans (and the chance for something to go wrong there if fields are added or whatever).

Happy to see this closed anyways now. Thanks!