developmentseed / supercluster-rs

Rust implementation of Supercluster for fast hierarchical point clustering
MIT License

WIP: added statistics #13

Open kylebarron opened 9 months ago

kylebarron commented 9 months ago

Manage and accumulate more information per cluster than just the number of points.
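
For context, the JavaScript Supercluster handles this with a pair of user-supplied `map` and `reduce` options: `map` turns one point's properties into an initial value, and `reduce` folds a child's value into its parent cluster. The following is only a rough Rust sketch of that idea, not the API of this PR. The names `Summary`, `from_point`, `merge`, and `summarize` are hypothetical and chosen for illustration.

```rust
/// Hypothetical per-cluster summary (not part of supercluster-rs):
/// the point count plus the sum and maximum of one numeric property.
#[derive(Debug, Clone, PartialEq)]
pub struct Summary {
    pub count: usize,
    pub sum: f64,
    pub max: f64,
}

impl Summary {
    /// "map" step: the summary of a single input point.
    pub fn from_point(value: f64) -> Self {
        Self { count: 1, sum: value, max: value }
    }

    /// "reduce" step: fold a child's summary into this (parent) summary.
    pub fn merge(&mut self, child: &Summary) {
        self.count += child.count;
        self.sum += child.sum;
        self.max = self.max.max(child.max);
    }
}

/// Summarize all points of one cluster; returns `None` for an empty slice.
pub fn summarize(values: &[f64]) -> Option<Summary> {
    let mut summaries = values.iter().map(|&v| Summary::from_point(v));
    let mut acc = summaries.next()?;
    for s in summaries {
        acc.merge(&s);
    }
    Some(acc)
}
```

Because `merge` only needs the two summaries, the same step can be applied at every zoom level when smaller clusters are combined into larger ones.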

vaduga commented 3 months ago

Following the commit in this pull request, I'm struggling to understand the idea behind the accumulator function and how to collect statistics for separate clusters. What I want for each cluster is a map with the threshold_Id point property as the key and the number of occurrences of that id among the cluster's points as the value.

I am just starting with Rust and would be very grateful if you could advise on my case. Otherwise, I'm very much looking forward to a general implementation in the repo :)

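use std::collections::HashMap;
use std::fmt::Debug;

use serde::{Deserialize, Serialize};

pub trait Accumulator: Debug {
    // Produce the starting statistic for index `i`.
    fn init(&self, i: usize) -> Statistic;
    // Record one observation of `value` at index `i`.
    fn accumulate(&mut self, i: usize, value: usize);
}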

#[derive(Debug, Clone)]
pub struct ThresholdIdsCounter {
    values: Vec<HashMap<usize, usize>>,
}

impl ThresholdIdsCounter {
    pub fn new() -> Self {
        let mut initial_map = HashMap::new();
        for i in 0..=10 {
            initial_map.insert(i, 0);
        }
        Self {
            values: vec![initial_map; 100], // total number of points
        }
    }
}

impl Accumulator for ThresholdIdsCounter {
    fn init(&self, i: usize) -> Statistic {
        // Get the value at index i, which is guaranteed to exist
        let value = self.values[i].clone();

        Statistic::new(value)
    }

    fn accumulate(&mut self, i: usize, value: usize) {
        // Only ids pre-seeded in `new` (0..=10) are counted; any other id is ignored.
        if let Some(count) = self.values[i].get_mut(&value) {
            *count += 1;
        }
    }
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Statistic {
    pub(crate) value: HashMap<usize, usize>,
}

impl Statistic {
    pub fn new(value: HashMap<usize, usize>) -> Self {
        Self { value }
    }
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Statistics(pub(crate) HashMap<String, Statistic>);

impl Statistics {
    pub fn new(stats: HashMap<String, Statistic>) -> Self {
        Self(stats)
    }
}

impl Default for Statistics {
    fn default() -> Self {
        Self(HashMap::new())
    }
}
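
One way to think about the per-cluster map described above, as a sketch that is independent of this PR's actual code: instead of one shared `Vec` indexed by point, each cluster owns its own histogram, and histograms are merged when points or sub-clusters are combined. The type `ThresholdHistogram` and the functions `from_point`, `merge`, `count`, and `histogram_for` are hypothetical names. How such a type would plug into the `Accumulator` trait is not something this sketch assumes.

```rust
use std::collections::HashMap;

/// Hypothetical per-cluster histogram: threshold id -> number of points with that id.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ThresholdHistogram {
    counts: HashMap<usize, usize>,
}

impl ThresholdHistogram {
    /// Histogram of a single point carrying one threshold id.
    pub fn from_point(threshold_id: usize) -> Self {
        let mut counts = HashMap::new();
        counts.insert(threshold_id, 1);
        Self { counts }
    }

    /// Fold a child (a point or a sub-cluster) into this cluster's histogram.
    /// Ids that were not seen before are inserted on the fly.
    pub fn merge(&mut self, child: &ThresholdHistogram) {
        for (&id, &n) in &child.counts {
            *self.counts.entry(id).or_insert(0) += n;
        }
    }

    /// Number of points in this cluster with the given id (0 if absent).
    pub fn count(&self, threshold_id: usize) -> usize {
        self.counts.get(&threshold_id).copied().unwrap_or(0)
    }
}

/// Build the histogram of one cluster from its members' threshold ids.
pub fn histogram_for(threshold_ids: &[usize]) -> ThresholdHistogram {
    let mut acc = ThresholdHistogram::default();
    for &id in threshold_ids {
        acc.merge(&ThresholdHistogram::from_point(id));
    }
    acc
}
```

With this shape, two clusters never share counts, and a parent cluster's histogram is just the merge of its children's histograms. Using `entry(..).or_insert(0)` also avoids having to pre-seed a fixed range of ids.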

kylebarron commented 2 months ago

I haven't looked at this PR in quite a while. Honestly, I don't remember exactly how to use the WIP code in this PR; I don't think it was fully functional.