seatonullberg / kernel-density-estimation

Kernel density estimation in Rust.
https://crates.io/crates/kernel-density-estimation
MIT License

Sampling is not accurate? #8

Open kchuangk opened 1 month ago

kchuangk commented 1 month ago

With the code sample below, the maximum sampled value is always below 5.0, even though the observations are centered on 5.0.

use kernel_density_estimation::prelude::*;

fn main() {
    let observations: Vec<f32> = vec![4.99, 5.0, 5.01];

    let kde = KernelDensityEstimator::new(observations, Scott, Normal);
    // Evaluation grid: 0.0, 0.1, ..., 10.0
    let pdf_dataset: Vec<f32> = (0..101).map(|x| x as f32 * 0.1).collect();
    let sample = kde.sample(pdf_dataset.as_slice(), 10000);
    println!("{:?}", sample);
}
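For comparison, here is a minimal cross-check sketch that does not use the crate at all. It relies on the fact that a KDE with a normal kernel is an equal-weight mixture of normals centered on the observations, so a draw can be made by picking an observation at random and adding Gaussian noise. The `XorShift` generator, the `sample_gaussian_kde` helper, and the bandwidth `0.008` are all assumptions made for illustration; the bandwidth in particular is not the value the crate's `Scott` rule computes. If sampling is symmetric, roughly half of the draws should land above 5.0.

```rust
/// Tiny xorshift64* generator so the sketch needs no external crates.
/// (Hypothetical helper, not part of kernel-density-estimation.)
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545F4914F6CDD1D)
    }

    /// Uniform float strictly between 0 and 1 (52 random bits).
    fn next_f64(&mut self) -> f64 {
        ((self.next_u64() >> 12) as f64 + 0.5) / (1u64 << 52) as f64
    }
}

/// Draw `n` samples from a Gaussian KDE over `observations` with bandwidth `h`.
fn sample_gaussian_kde(observations: &[f64], h: f64, n: usize, rng: &mut XorShift) -> Vec<f64> {
    (0..n)
        .map(|_| {
            // Pick one observation uniformly at random.
            let idx = ((rng.next_f64() * observations.len() as f64) as usize)
                .min(observations.len() - 1);
            // Box-Muller transform: two uniforms -> one standard normal draw.
            let (u1, u2) = (rng.next_f64(), rng.next_f64());
            let z = (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos();
            observations[idx] + h * z
        })
        .collect()
}

fn main() {
    let observations = [4.99, 5.0, 5.01];
    let h = 0.008; // assumed bandwidth, chosen for illustration only
    let mut rng = XorShift(0x9E3779B97F4A7C15);
    let sample = sample_gaussian_kde(&observations, h, 10_000, &mut rng);

    let above = sample.iter().filter(|&&x| x > 5.0).count();
    let max = sample.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    println!(
        "fraction above 5.0: {:.3}, max: {:.4}",
        above as f64 / sample.len() as f64,
        max
    );
}
```

This only shows what a symmetric sampler would look like for these observations; it says nothing about where the cutoff in `kde.sample` comes from.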
kchuangk commented 1 month ago

There's definitely an issue with the cutoff on the right-hand side: the max is always 0.5. Here's another code sample.

Summary stats for the sample (multiple runs produce similar results): 25%: 0.4925, 50%: 0.495, 75%: 0.4974, max: 0.5, mean: 0.495, median: 0.495, min: 0.49

use kernel_density_estimation::prelude::*;
use polars::prelude::*; // provides df! and CsvWriter

fn main() {
    // pdf_dataset, sample1, and sample2 are not defined in this snippet as posted.
    let kde = KernelDensityEstimator::new(vec![0.4999999, 0.50, 0.5000001], Scott, Normal);
    let sample3 = kde.sample(pdf_dataset.as_slice(), 10000);
    // println!("{:?}",sample);
    let mut df = df!("tester" => sample1, "tester2" => sample2, "test_5" => sample3).unwrap();
    let mut file = std::fs::File::create("tmp.csv").unwrap();
    CsvWriter::new(&mut file).finish(&mut df).unwrap();
}
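The quartiles, mean, and extremes quoted above can also be computed without a data-frame library. Below is a minimal sketch using only the standard library. The `Summary` struct and the `quantile` and `summarize` helpers are hypothetical names, and the quantiles use linear interpolation between order statistics, which is one common convention and may differ slightly from what other tools report. The input in `main` is toy data, not the sample from this issue.

```rust
/// Summary statistics for a sample (hypothetical helper type).
#[derive(Debug)]
struct Summary {
    min: f32,
    q25: f32,
    median: f32,
    q75: f32,
    max: f32,
    mean: f32,
}

/// Quantile of an already sorted, non-empty slice, using linear
/// interpolation between the two nearest order statistics.
fn quantile(sorted: &[f32], q: f32) -> f32 {
    let pos = q * (sorted.len() - 1) as f32;
    let lo = pos.floor() as usize;
    let hi = pos.ceil() as usize;
    let frac = pos - lo as f32;
    sorted[lo] + (sorted[hi] - sorted[lo]) * frac
}

/// Returns `None` for an empty sample.
fn summarize(sample: &[f32]) -> Option<Summary> {
    if sample.is_empty() {
        return None;
    }
    let mut sorted = sample.to_vec();
    // total_cmp gives a total order on floats, so sorting cannot panic on NaN.
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mean = sorted.iter().sum::<f32>() / sorted.len() as f32;
    Some(Summary {
        min: sorted[0],
        q25: quantile(&sorted, 0.25),
        median: quantile(&sorted, 0.5),
        q75: quantile(&sorted, 0.75),
        max: sorted[sorted.len() - 1],
        mean,
    })
}

fn main() {
    // Toy input for illustration; pass the Vec<f32> returned by kde.sample instead.
    let sample = vec![3.0_f32, 1.0, 5.0, 2.0, 4.0];
    println!("{:?}", summarize(&sample));
}
```

Passing the vector returned by `kde.sample` to `summarize` makes it easy to compare `max` against the largest observation across runs.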