wvwwvwwv / scalable-concurrent-containers

High performance containers and utilities for concurrent and asynchronous programming
Apache License 2.0
319 stars 16 forks source link

the trait `std::marker::Send` is not implemented for `*mut ebr::collector::Collector` #148

Closed CR7273868 closed 3 months ago

CR7273868 commented 3 months ago

Guard is not send?

CR7273868 commented 3 months ago

This issue is replicable when using Guard in a tokio::spawn(async move {});

CR7273868 commented 3 months ago

Guard has to keep it accessible into current thread. So that makes iteration for multithreaded async env not suitable with HashIndex?

changgyoopark-db commented 3 months ago

Hi,

Guard being not Send is totally intended because Guard has a reference to the thread local storage; the thread local storage may be gone if the thread is joined right after the Guard is sent to another thread.

If spawn complains about Guard not being able to be sent, 1. there must be an await during the lifetime of the Guard, or 2. the Guard is instantiated outside the supplied closure (async move {}).

changgyoopark-db commented 3 months ago

In other words, Guard can be used inside the closure if,

  1. Guard is instantiated "inside" the closure.
  2. There is no await during the lifetime of the Guard.
wvwwvwwv commented 3 months ago

I assume that what you want is iteration over HashIndex entries in an asynchronous closure containing await points. In that case, I'd suggest that you try the Entry API; OccupiedEntry::next_async allows you to iterate over entries.

let first_entry =  hashindex.first_entry_async().await.unwrap();
let second_entry = first_entry.next_async().await.unwrap();
...  
CR7273868 commented 3 months ago

Yeah but there can be loads of entries. And calling next on async every-time isn’t that exhaustive? I was using HashIndex for it’s iter capabilities. And then I used HashMap for it’s scan on before-hand. I removed HashMap, because I could minimize lines of code with the iter methods.

wvwwvwwv commented 3 months ago

This seems like adding scan to HashIndex would make sense in this case. What do you think of it? Making Guard send-able is not an option I’m afraid to say.

CR7273868 commented 3 months ago

A scan is definitely appreciated. But, maybe, if we’re going in that direction. Helper methods that are built on top of scan. Making it more niche to use. Because, allocating a mutable variable sprinkling that everywhere, seems a bit tedious for me. Although, the scan method did help me out. And maybe add a boolean or ControlFlow to the scan method. I know thay a library named HaxMap in Go does the boolean approach.

Scan is possible, but helper methods are definitely a good idea for finding values in an iterate. Or allow to skip, or use the entry api to skip until it found the desired entry.

changgyoopark-db commented 3 months ago

HashMap::any or HashMap::any_async would be the one that suits your usage, and they can be easily ported to HashIndex; would that be OK?

CR7273868 commented 3 months ago

Yeah I use HashMap. Not even HashIndex. I would just like to see more helpers methods for now I am using the entry api which is the next best thing.

CR7273868 commented 3 months ago

I haven't had time yet to fully consume the working of your library, but I made on helper method for anyone to grab or for u to integrate if it's any good. I am still understanding where everything goes. Once, I have done that. I can also fully help u on this library and also fix your issues in the meanwhile :

    /// Gets the first entry that matches the predicate for in-place manipulation.
    ///
    /// The returned [`OccupiedEntry`] in combination with [`OccupiedEntry::next`] or
    /// [`OccupiedEntry::next_async`] can act as a mutable iterator over entries.
    ///
    /// It is an asynchronous method returning an `impl Future` for the caller to await.
    ///
    /// # Examples
    ///
    /// ```
    /// use scc::HashMap;
    ///
    /// let mut hashmap: HashMap<char, u32> = HashMap::default();
    /// hashmap.insert('a', 1).await;
    ///
    /// let entry = hashmap.find_entry_async(|key, value| *key == 'a' && *value == 1).await;
    /// assert!(entry.is_some());
    /// if let Some(mut entry) = entry {
    ///     *entry.get_mut() = 2;
    /// }
    /// assert_eq!(*hashmap.get(&'a').unwrap(), 2);
    /// ```
    pub async fn find_entry_async<F>(&self, pred: F) -> Option<OccupiedEntry<'_, K, V, H>>
    where
        F: Fn(&K, &V) -> bool,
    {
        let mut current_entry = self.first_entry_async().await;
        while let Some(entry) = current_entry {
            if pred(entry.key(), &entry.get()) {
                return Some(entry);
            }
            current_entry = entry.next_async().await;
        }
        None
    }

I used your preexisting iterator behavior like entry API to make this happen.

CR7273868 commented 3 months ago

I added two other methods. The find expected is nothing more than a QOL. It's one that suits my needs. If u got any improvements for me to remove the Clone requirement let me know, but I haven't found a work around, because that reference from the nested hashmap (scc) outlives the parent.

    /// Gets the first entry that matches the predicate for in-place manipulation.
    ///
    /// The returned [`OccupiedEntry`] in combination with [`OccupiedEntry::next`] or
    /// [`OccupiedEntry::next_async`] can act as a mutable iterator over entries.
    ///
    /// It is an synchronous method returning an OccuipedEntry for the caller to manipulate.
    ///
    /// # Examples
    ///
    /// ```
    /// use scc::HashMap;
    ///
    /// let mut hashmap: HashMap<char, u32> = HashMap::default();
    /// hashmap.insert('a', 1).await;
    ///
    /// let entry = hashmap.find_entry_async(|key, value| *key == 'a' && *value == 1).await;
    /// assert!(entry.is_some());
    /// if let Some(mut entry) = entry {
    ///     *entry.get_mut() = 2;
    /// }
    /// assert_eq!(*hashmap.get(&'a').unwrap(), 2);
    /// ```
    pub fn find_entry<'h, F>(&'h self, pred: F) -> Option<OccupiedEntry<'h, K, V, H>>
    where
        F: Fn(&K, &V) -> bool,
    {
        let mut current_entry = self.first_entry();
        while let Some(entry) = current_entry {
            if pred(entry.key(), &entry.get()) {
                return Some(entry);
            }
            current_entry = entry.next();
        }
        None
    }

    /// Finds the first entry that matches the predicate and returns an expected type.
    ///
    /// This method iterates over the entries in the `HashMap` asynchronously,
    /// applying the predicate function to each entry. The predicate function returns
    /// a tuple containing a boolean indicating whether the entry matches and an optional result.
    /// If an entry matches, the function returns the result wrapped in an `Option`.
    ///
    /// The returned [`OccupiedEntry`] in combination with [`OccupiedEntry::next`] or
    /// [`OccupiedEntry::next_async`] can act as a mutable iterator over entries.
    ///
    /// This function is especially useful for nested `HashMap`s where you need to
    /// find and potentially manipulate nested entries based on certain conditions.
    ///
    /// It is an asynchronous method returning an `impl Future` for the caller to await.
    ///
    /// # Examples
    ///
    /// ```
    /// use scc::HashMap;
    ///
    /// let mut hashmap: HashMap<char, u32> = HashMap::default();
    /// hashmap.insert('a', 1).await;
    ///
    /// let result: Option<u32> = hashmap.find_entry_expected(|key, value| {
    ///     if *key == 'a' && *value == 1 {
    ///         (true, Some(*value))
    ///     } else {
    ///         (false, None)
    ///     }
    /// }).await;
    ///
    /// assert_eq!(result, Some(1));
    /// ```
    ///
    /// # Nested HashMap Example
    ///
    /// ```
    /// use scc::HashMap;
    /// use std::sync::Arc;
    ///
    /// let mut outer_map: HashMap<String, HashMap<String, Arc<RoomSupervisor>>> = HashMap::default();
    /// let mut inner_map: HashMap<String, Arc<RoomSupervisor>> = HashMap::default();
    /// inner_map.insert("inner_key".to_string(), Arc::new(RoomSupervisor::new())).await;
    /// outer_map.insert("outer_key".to_string(), inner_map).await;
    ///
    /// let result: Option<Arc<RoomSupervisor>> = outer_map.find_entry_expected(|_, inner| {
    ///     inner.find_entry(|key, value| {
    ///         if key == "inner_key" {
    ///             true
    ///         } else {
    ///             false
    ///         }
    ///     })
    /// }).await;
    ///
    /// assert!(result.is_some());
    /// ```
    ///
    /// # Type Parameters
    ///
    /// - `T`: The type to return if the predicate finds a match. This type must be derived from the key-value pairs within the `OccupiedEntry`.
    /// - `F`: The predicate function type, which takes references to the key and value, and returns a tuple of a boolean and an optional `T`.
    ///
    /// # Parameters
    ///
    /// - `pred`: A predicate function that takes references to the key and value, and returns a tuple where the first element is a boolean indicating if the entry matches and the second element is an optional result.
    /// Only works on `HashMap`s with keys and values that are `Clone` OR `Copy` (Arc, Rc, etc.)
    pub async fn find_entry_expected<'h, T, F>(&'h self, pred: F) -> Option<T>
    where
        T: Clone + 'h,
        F: Fn(&K, &V) -> (bool, Option<T>),
        K: 'h,
        V: 'h,
        H: 'h,
    {
        let mut current_entry = self.first_entry_async().await;
        while let Some(entry) = current_entry {
            let key = entry.key();
            let value = entry.get();
            let (found, result) = pred(key, value);
            if found {
                return result;
            }
            current_entry = entry.next_async().await;
        }
        None
    }
The documentation for find_entry_expected is wrong. Inside the callback u shouldn
changgyoopark-db commented 3 months ago

@Demy076 Thanks for the code snippets, I now understand what you wanted to achieve! Unfortunately, I am still quite busy these days, so not sure when I'll be able to incorporate the code. -> It will take ~2 weeks for me to release a new version including those new methods, I think. Would that be OK?

CR7273868 commented 3 months ago

@changgyoopark-db more than fine. Any improvements on how to surpass the clone? That's my only issue. To make it more broadly compatible without the necessary use of smart pointers. For types to prevent it from complaining that it outlives.

changgyoopark-db commented 3 months ago

I'll clean up the Clone bound - I don't think that is needed at all in the method definition since no 'clone' is explicitly called in it. In my opinion, passing OccupiedEntry<K, V> to pred, and letting pred return Option<OccupiedEntry<K, V>> can be an option: returning Some means the predicate is not satisfied for the entry.

CR7273868 commented 3 months ago

I'll clean up the Clone bound - I don't think that is needed at all in the method definition since no 'clone' is explicitly called in it. In my opinion, passing OccupiedEntry<K, V> to pred, and letting pred return Option<OccupiedEntry<K, V>> can be an option: returning Some means the predicate is not satisfied for the entry.

I don't think that's possible? Because, the expected function returns a value from the child map if one is in there. My original idea was to have it return a OccupiedEntry from nested map. But, yeah passing OccupiedEntry into pred is possible having that return to but if u return the same K, V it will just be the parent, which means u can indeed clean the Clone bound, but that causes that method to lose it's meaning.

changgyoopark-db commented 3 months ago

Yes, I'm aware of it, but I'm concerned about it being only meaningful for niche use cases. I still think you can achieve exactly the same with pred receiving Outer::OccupiedEntry<K, V> by using a side effect -

let mut result: Option<Inner::OccupiedEntry<InnerK, InnerV>> = None;
outer.find_entry_if(&self, |o: Outer::OccupiedEntry<OuterK, OuterV>| {
    if let Some(xxx) = o.find(...) { 
        result.replace(xxx);
        None
    } else {
        Some(o}
    }
}).await;

What do you think of it?

CR7273868 commented 3 months ago

Yes, I'm aware of it, but I'm concerned about it being only meaningful for niche use cases. I still think you can achieve exactly the same with pred receiving Outer::OccupiedEntry<K, V> by using a side effect -

let mut result: Option<Inner::OccupiedEntry<InnerK, InnerV>> = None;
outer.find_entry_if(&self, |o: Outer::OccupiedEntry<OuterK, OuterV>| {
    if let Some(xxx) = o.find(...) { 
        result.replace(xxx);
        None
    } else {
        Some(o}
    }
}).await;

What do you think of it?

I wasn't quite able to realize this just yet. I do understand what u'r trying to say. This would require a find_entryif method and another method to the OccupiedEntry<' K, V, H> struct. I'm not that deep into the library yet.

wvwwvwwv commented 3 months ago

Yep, anyway, I'll propose something useful in a couple of weeks and let you know!

CR7273868 commented 3 months ago

Yep, anyway, I'll propose something useful in a couple of weeks and let you know!

Thanks. If u would have a community discord / any form of communication (if there is time for that), for your SCC containers and libraries, that would be great. I would like to discuss and possibly learn from this as this seems super interesting. I've had more time growing into SCC before moving onto SDD.

wvwwvwwv commented 3 months ago

Uploaded a piece of code (https://github.com/wvwwvwwv/scalable-concurrent-containers/commit/dee0de7e353031a70eb35d0e594eefc512b80b0c) that is merely a helper method on top of first_entry. -> Still incomplete.

=> any_entry(_async)(pred: P) returns Option<OccupiedEntry> that satisfies the supplied predicate.

In your case where hash maps are nested,

let mut result: Option<Inner::OccupiedEntry<InnerK, InnerV, InnerH>> = None;
outer.any_entry_async(&self, |_, v| {
    if let Some(inner_entry) = v.get("inner_key") {
        result.replace(inner_entry);
        return true;
    }
    false
}).await;

-> Any feedback if welcome :-)

Thanks for your interest in the project. Unfortunately there is no communication channel for this crate other than this GitHub repo, because I just have no time for that.

CR7273868 commented 3 months ago

This looks very promising. I can share a version of my hashmap.rs. I didn't fork it.

Since I cloned and didn't fork it. I am most likely not able to make a PR. I added some QOL methods like filter. filter_expected. Values & values_async.

I removed clone bounds from methods. I will rework too by respecting u'r recent push for support nested maps, which can remove clone bounds set by me even more.

I will continue to expand on the most recent hash_map version. And fork that one.

New methods are (based on 2.1.1):

    pub fn filter_entries<'h, F, T>(&'h self, pred: F) -> Vec<T> // Should be filter_map_entries
    where
        F: Fn(&K, &V) -> Option<T>,
    {
        let mut current_entry = self.first_entry();
        let mut results = Vec::new();
        while let Some(entry) = current_entry {
            if let Some(result) = pred(entry.key(), &entry.get()) {
                results.push(result);
            }
            current_entry = entry.next();
        }
        results
    }

 pub async fn find_entry_expected<T, F>(&self, pred: F) -> Option<T> // should be find_map_entry_async
    where
        F: Fn(&K, &V) -> (bool, Option<T>),
    {
        let mut current_entry = self.first_entry_async().await;
        while let Some(entry) = current_entry {
            let key = entry.key();
            let value = entry.get();
            let (found, result) = pred(key, value);
            if found {
                return result;
            }
            current_entry = entry.next_async().await;
        }
        None
    }

    pub async fn values_async(&self) -> Vec<V>
    where
        V: Clone,
    {
        let mut values = Vec::new();
        let mut current_entry = self.first_entry_async().await;
        while let Some(entry) = current_entry {
            values.push(entry.get().clone());
            current_entry = entry.next_async().await;
        }
        values
    }

Some of these methods do not live up to their expectation, as filter would only filter out elements and can return a list of occupiedEntries. But when I tried this I got lifetime issues, and I was kinda racing against the clock for a project with this implementation. So, method names also need to be more thorough from my side. Of every method there is either a asyncronous or syncronous counter part.

CR7273868 commented 3 months ago
    pub fn find_entry<'h, F>(&'h self, pred: F) -> Option<OccupiedEntry<'h, K, V, H>>
    where
        F: Fn(&K, &V) -> Option<()>,
    {
        let mut current_entry = self.first_entry();
        while let Some(entry) = current_entry {
            if pred(entry.key(), &entry.get()).is_some() {
                return Some(entry);
            }
            current_entry = entry.next();
        }
        None
    }

This was my previous implementation hat will reworked, because yours handles option safety as well using an Outer<K,V,H> too.

wvwwvwwv commented 3 months ago

find_entry is close to what I uploaded a few days ago, except that any_entry* doesn't return OccupiedEntry; I'll think about it further if returning OccupiedEntry makes more sense.

On the other hand, sorry to say that I wouldn't accept the other three methods that you suggested in the previous comment, because,

changgyoopark-db commented 3 months ago

Hi, I just uploaded any_entry without changing much. I'd like to close this issue once I publish 2.1.2. Please create a new issue if you have any other inquires!