plotters-rs / plotters

A rust drawing library for high quality data plotting for both WASM and native, statically and realtimely 🦀 📈🚀
https://plotters-rs.github.io/home/
MIT License

Histogram with multiple data sets. #403

Open RiwEZ opened 2 years ago

RiwEZ commented 2 years ago

What is the feature?

Make plotters::series::Histogram able to handle multiple data sets, as matplotlib does (examples).

Why this feature is useful and how people would use the feature?

This would make plotting a histogram with multiple data sets much easier than drawing each rectangle yourself, as in this code:

use plotters::prelude::*;
use std::error::Error;

pub fn draw_2hist(
    datas: [Vec<f64>; 2],
    title: &str,
    axes_desc: (&str, &str),
    path: String,
) -> Result<(), Box<dyn Error>> {
    let n = datas.iter().fold(0f64, |max, l| max.max(l.len() as f64));
    let max_y = datas.iter().fold(0f64, |max, l| {
        max.max(l.iter().fold(f64::NAN, |v_max, &v| v.max(v_max)))
    });

    let root = BitMapBackend::new(&path, (1024, 768)).into_drawing_area();
    root.fill(&WHITE)?;

    let mut chart = ChartBuilder::on(&root)
        .caption(title, ("Hack", 44, FontStyle::Bold).into_font())
        .margin(20)
        .x_label_area_size(50)
        .y_label_area_size(60)
        .build_cartesian_2d((1..n as u32).into_segmented(), 0.0..max_y)?
        .set_secondary_coord(0.0..n, 0.0..max_y);

    chart
        .configure_mesh()
        .disable_x_mesh()
        .y_desc(axes_desc.1)
        .x_desc(axes_desc.0)
        .axis_desc_style(("Hack", 20))
        .draw()?;

    // Bars for the first data set: left part of each unit-wide slot (x + 0.1 to x + 0.5)
    let a = datas[0].iter().zip(0..).map(|(y, x)| {
        Rectangle::new(
            [(x as f64 + 0.1, *y), (x as f64 + 0.5, 0f64)],
            Into::<ShapeStyle>::into(&RED).filled(),
        )
    });
    // Bars for the second data set: right part of each unit-wide slot (x + 0.5 to x + 0.9)
    let b = datas[1].iter().zip(0..).map(|(y, x)| {
        Rectangle::new(
            [(x as f64 + 0.5, *y), (x as f64 + 0.9, 0f64)],
            Into::<ShapeStyle>::into(&BLUE).filled(),
        )
    });

    chart.draw_secondary_series(a)?;
    chart.draw_secondary_series(b)?;

    chart
        .configure_series_labels()
        .position(SeriesLabelPosition::UpperRight)
        .label_font(("Hack", 14).into_font())
        .background_style(&WHITE)
        .border_style(&BLACK)
        .draw()?;

    root.present()?;
    Ok(())
}

This might also be what #224 is talking about.
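For comparison, the matplotlib-style behaviour requested here boils down to binning every data set against the same bin edges and then drawing the per-bin counts side by side. The following is a minimal sketch of that binning step only, in plain Rust with no plotters calls; `bin_counts` and `grouped_counts` are hypothetical helper names, not part of the plotters API, and equal-width bins over a half-open range are an assumption.

```rust
/// Count how many values of `data` fall into each of `bins` equal-width
/// bins covering the half-open range [lo, hi). Values outside the range
/// (and NaN) are ignored. Hypothetical helper, not plotters API.
fn bin_counts(data: &[f64], lo: f64, hi: f64, bins: usize) -> Vec<usize> {
    let mut counts = vec![0usize; bins];
    if bins == 0 || !(hi > lo) {
        return counts;
    }
    let bin_width = (hi - lo) / bins as f64;
    for &v in data {
        if v >= lo && v < hi {
            // The clamp guards against floating-point rounding at the upper edge.
            let idx = (((v - lo) / bin_width) as usize).min(bins - 1);
            counts[idx] += 1;
        }
    }
    counts
}

/// Bin several data sets against the same edges, so that the counts of
/// every data set can be drawn side by side within each bin.
fn grouped_counts(datasets: &[Vec<f64>], lo: f64, hi: f64, bins: usize) -> Vec<Vec<usize>> {
    datasets.iter().map(|d| bin_counts(d, lo, hi, bins)).collect()
}
```

Each inner vector could then be turned into rectangles in the same way as `a` and `b` in the code above, with one horizontal sub-slot per data set.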

PSeitz commented 1 year ago

Similar to https://github.com/plotters-rs/plotters/issues/211

cszach commented 1 year ago

For those curious, this is what the above code produces:

output

The code is:

    draw_2hist(
        [
            vec![
                19.7, 22.21, 16.14, 11.0, 13.0, 26.28, 24.0, 9.1, 5.9, 10.25, 29.1,
            ],
            vec![
                30.25, 4.20, 18.23, 6.15, 2.0, 8.0, 3.19, 17.9, 27.1, 12.0, 1.0,
            ],
        ],
        "Example grouped bar chart",
        ("x", "y"),
        String::from("output.png"),
    );

wangjiawen2013 commented 5 months ago

Hi, I made a more comprehensive version of the grouped bar chart, which adds real x labels and a legend, based on the above examples:

image

Here is the full code:

// Construct a polars dataframe
let mut drug = df![
    "concentration" => &[2.1, 1.0, 3.2, 4.0, 5.5],
    "GAPDH" => &[1.4, 1.5, 1.7, 1.6, 1.9],
    "ACTB" => &[1.3, 1.4, 1.0, 0.9, 1.2],
    "SOX9" => &[0.5, 0.6, 0.7, 0.8, 0.9],
]?;
drug = drug.melt(["concentration"], ["GAPDH", "ACTB", "SOX9"])?;
drug.rename("variable", "genes");
drug.rename("value", "expression");

// Step 1: Extract the data we need
let mut dataset = drug.lazy()
    .select([
        col("concentration").cast(DataType::String),
        col("expression").cast(DataType::Float64),
        col("genes")
    ])
    .collect()?;
dataset.sort_in_place(["concentration"], false, true);

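// Step 2: Set x, y, the groups, and the color of each bar
let x = "concentration";  // name of the column holding the x-axis data; values must not repeat within a group
let x_names = ["1.0 uM", "2.1 uM", "3.2 uM", "4.0 uM", "5.5 uM"];  // display names of the x-axis categories
let y = "expression";  // name of the column holding the y-axis data
let group = "genes";  // name of the column holding the group labels
let groups = ["GAPDH", "ACTB", "SOX9"];  // the groups
// Bar width (default 0.1); the number of bars at one x position times the width must be less than 1.
let width = 0.1;
// Default color for each group; optionally edit this to override the defaults.
let colors: Vec<RGBColor> = (3..groups.len()+3).map(Palette99::pick).map(|x| RGBColor(x.rgb().0, x.rgb().1, x.rgb().2)).collect();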

// Step 3: Generate the position offset for each group
let n = x_names.len();  // number of x positions (categories on the x axis)
let m = groups.len();  // number of groups, i.e. bars drawn at each x position
let y_max: f64 = dataset.column(y)?.max()?.unwrap();
let y_start = 0.0f64;
let y_end = 1.1 * y_max;
// Automatically compute the x-position offset of each group's bars
let offsets: Vec<f64> = (1..=m).map(|x| (x as f64 - 1.0 - m as f64/2.0) * width - 0.5).collect();

// Step 4: Plot
let mut colors = colors.into_iter();
let mut offsets = offsets.into_iter();
evcxr_figure((640, 480), |root| {
    // Configure the drawing area (it exists independently of the chart)
    root.fill(&WHITE).unwrap();
    let mut chart = ChartBuilder::on(&root).margin(20)
        .set_label_area_size(LabelAreaPosition::Left, 65)
        .set_label_area_size(LabelAreaPosition::Bottom, 60)
        .caption("Scatter Plot", ("sans-serif", 30))
        .build_cartesian_2d(
            (1..n).into_segmented(),  // segmented coordinate, used for discrete variables
            y_start..y_end
        )?
        .set_secondary_coord(0.0..n as f64, y_start..y_end);
    // Draw the mesh and axes
    chart
        .configure_mesh()
        .x_desc(x)
        .y_desc(y)
        .axis_desc_style(("sans-serif", 25).into_font().color(&BLACK))
        .label_style(("sans-serif", 20).into_font().color(&BLACK))
        .x_label_formatter(&|x| match x {  // replace the numeric coordinates with the real names
            SegmentValue::CenterOf(x) => x_names[*x as usize - 1].to_string(),
            _ => "unknown".to_string(),
        })
        .draw()?;
    // Build the data points and draw the bars; izip! arguments must be references
    for g in groups {
        let color = colors.next().unwrap();
        let offset = offsets.next().unwrap();
        let mask = dataset.column(group)?.equal(g)?;
        let dataset_subset = dataset.filter(&mask)?;
        //let x_data: Vec<&str> = dataset_subset.column(x)?.str()?.into_no_null_iter().collect();
        let y_data: Vec<f64> = dataset_subset.column(y)?.f64()?.into_no_null_iter().collect();
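        let points = itertools::izip!((1..=n), &y_data);  // izip! arguments must be references or ranges
        // draw_series actually takes individual elements; x and y here are already segmented coordinates, so Exact is needed to recover the original coordinate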
        chart.draw_secondary_series(points.map(|(x, y)| {
            let mut bar = Rectangle::new([(x as f64 + offset, 0.0), (x as f64 + offset + width, *y)], color.filled());
            bar.set_margin(0, 0, 0, 0);  // top, bottom, left, and right margins
            bar
            }))?
            .label(g)  // the `move` in the legend closure below takes ownership of color so the next entry does not overwrite it; it is required, otherwise compilation fails
            .legend(move |(x, y)| Rectangle::new([(x, y-6), (x+12, y+6)], color.filled()));
    }
    // Draw the legend once after the loop; this is why `move` is needed above, otherwise every entry would show the last entry's style
    chart
        .configure_series_labels()
        .position(SeriesLabelPosition::UpperLeft)
        .margin(10)
        .label_font(("Calibri", 25))  // into_font() or into_text_style(&root) can also be appended
        .draw()?;

    Ok(())
}).style("width:60%")

When using another dataset, we only need to set x, x_names, y, group, and groups in step 2. Steps 3 and 4 then run without changes.
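To make the offset formula from step 3 easier to check, here is a small standalone sketch that reproduces it outside the plotting code. `group_offsets` and `bar_spans` are hypothetical helper names introduced only for illustration; the arithmetic is the same as in the `offsets` line above, and x positions are assumed to be 1-based as in the `izip!` call.

```rust
/// Left-edge offsets for `m` bars of width `width`, using the same formula
/// as step 3. Added to a 1-based x position, the offsets centre the whole
/// group of bars on `x - 0.5`, the middle of that segment.
fn group_offsets(m: usize, width: f64) -> Vec<f64> {
    (1..=m)
        .map(|i| (i as f64 - 1.0 - m as f64 / 2.0) * width - 0.5)
        .collect()
}

/// (left, right) extent of every bar drawn at the 1-based x position `x`,
/// expressed in the secondary (continuous) coordinate system.
fn bar_spans(x: usize, m: usize, width: f64) -> Vec<(f64, f64)> {
    group_offsets(m, width)
        .into_iter()
        .map(|offset| (x as f64 + offset, x as f64 + offset + width))
        .collect()
}
```

With three groups and a width of 0.1, the bars at the first x position cover roughly 0.35 to 0.65, which is centred on 0.5. This also shows why the number of bars times the width has to stay below 1: otherwise the group would spill into the neighbouring segment.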

However, I don't know how to set the spacing between legend entries or how to create a multi-column legend, so when there are more groups, the legend runs outside the figure.
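One possible workaround, offered only as a sketch, is to skip `configure_series_labels` and draw the legend by hand, which leaves the spacing and the number of columns under your control. The snippet below covers just the layout arithmetic; `legend_grid` and its parameters are hypothetical names, not plotters API, and it assumes pixel coordinates with the origin at the top-left corner of the drawing area.

```rust
/// Top-left pixel position of each legend entry when `count` entries are
/// laid out row by row in `columns` columns, starting at `origin`.
/// `cell` is the (width, height) reserved for one entry, so it also sets
/// the horizontal and vertical spacing between entries.
fn legend_grid(
    count: usize,
    columns: usize,
    origin: (i32, i32),
    cell: (i32, i32),
) -> Vec<(i32, i32)> {
    // Treat a column count of 0 as a single column.
    let columns = columns.max(1);
    (0..count)
        .map(|i| {
            let col = (i % columns) as i32;
            let row = (i / columns) as i32;
            (origin.0 + col * cell.0, origin.1 + row * cell.1)
        })
        .collect()
}
```

Each returned position could then be used to draw a small filled `Rectangle` and a text label for one group directly on the drawing area, in place of the built-in series labels.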