wiseaidev / rust-data-analysis

Rust for data analysis encyclopedia (WIP).
Apache License 2.0

[💄Feature]: Broken Code Discussion #17

Closed sgeos closed 1 year ago

sgeos commented 1 year ago

👶 Getting Started Please search the history to see if an issue already exists for the same problem.

📝 Describe the feature I'm working through the first notebook, and some of the code does not work. At this point, I have gotten all the cells to compile and run in some form or another, but I have no idea if they are correct, since I am not exactly sure what the goal is for some of these cells. This is not really a bug, and a PR seems premature. Is there a less formal channel to communicate about these issues? If not, what is the best way to address broken cells where the goal is unclear (PR, bug, etc.)?

📸 Screenshots N/A

🔦 Context

Concatenation. Needed 2D arrays to concatenate.

{
  let axis = Axis(1); // dropped at end of block
  let array_data_a = numeric_iris_ndarray.column(1).insert_axis(axis); // dropped at end of block
  let array_data_b = numeric_iris_ndarray.column(2).insert_axis(axis); // dropped at end of block
  concatenate(axis, &[array_data_a, array_data_b])? // returned by block
}
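To illustrate why the 2D step matters: `column(i)` yields a one-dimensional view, and joining along `Axis(1)` needs an `n x 1` shape on each side, which is what `insert_axis` provides. The sketch below mimics that with nested `Vec`s and only the standard library. The helper names `as_column_matrix` and `concat_columns` and the sample values are made up for illustration and are not part of ndarray or the notebook.

```rust
/// Reshape a 1-D column of n values into an n x 1 matrix,
/// loosely analogous to `insert_axis(Axis(1))` (hypothetical helper).
fn as_column_matrix(column: &[f64]) -> Vec<Vec<f64>> {
    column.iter().map(|&v| vec![v]).collect()
}

/// Join two matrices side by side (along axis 1).
/// Returns None when the row counts differ (hypothetical helper).
fn concat_columns(a: &[Vec<f64>], b: &[Vec<f64>]) -> Option<Vec<Vec<f64>>> {
    if a.len() != b.len() {
        return None;
    }
    Some(
        a.iter()
            .zip(b)
            .map(|(row_a, row_b)| {
                // Start from the left row and append the right row's values.
                let mut row = row_a.clone();
                row.extend_from_slice(row_b);
                row
            })
            .collect(),
    )
}

fn main() {
    // Made-up sample values, not read from the iris dataset.
    let a = as_column_matrix(&[3.5, 3.0, 3.2]);
    let b = as_column_matrix(&[1.4, 1.4, 1.3]);
    let joined = concat_columns(&a, &b).expect("row counts match");
    println!("{:?}", joined);
}
```

Each output row holds one value from each input column, which is the `n x 2` shape the ndarray cell should also produce.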

Separating Data. This is relatively clean, but I'm not sure what the intended output is. It should be easy enough to get what you want from here.

let column_name = "Species";
let column_filter_value = "Iris-virginica";
let mask = iris_df.column(column_name)?.equal(column_filter_value)?;
let filtered_dataset = iris_df.filter(&mask)?; // filter returns a Result, so unwrap it with ?
filtered_dataset
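For anyone unsure what the mask-and-filter pattern is doing, here is a minimal standard-library sketch of the same idea: compare one column against a value to get a boolean mask, then keep the rows where the mask is true. The helpers `equal_mask` and `filter_rows` and the sample rows are assumptions made for illustration; they are not polars functions.

```rust
/// Build a boolean mask: true where the label equals `wanted`
/// (hypothetical helper, standing in for a column comparison).
fn equal_mask(labels: &[&str], wanted: &str) -> Vec<bool> {
    labels.iter().map(|label| *label == wanted).collect()
}

/// Keep only the rows whose mask entry is true
/// (hypothetical helper, standing in for a data frame filter).
fn filter_rows<T: Clone>(rows: &[T], mask: &[bool]) -> Vec<T> {
    rows.iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(row, _)| row.clone())
        .collect()
}

fn main() {
    // Made-up rows for illustration only.
    let species = ["Iris-setosa", "Iris-virginica", "Iris-versicolor", "Iris-virginica"];
    let sepal_length = [5.1, 6.3, 7.0, 5.8];

    let mask = equal_mask(&species, "Iris-virginica");
    let kept = filter_rows(&sepal_length, &mask);
    println!("{:?}", kept);
}
```

If the intended output of the notebook cell is "only the Iris-virginica rows", this is the behavior the polars version should match.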

Combination of Histogram and Scatter. Plotters histograms do not appear to accept floating point values, so the values are cast to integers here. The plot output is clearly wrong, but all three datasets are being plotted. Hopefully, this is enough to get the plot right if the intended output is known.

evcxr_figure((640, 480), |root| {
    let root = root.titled("Scatter with Histogram Example", ("Arial", 20).into_font())?;

    let areas = root.split_by_breakpoints([560], [80]);

    let mut x_hist_ctx = ChartBuilder::on(&areas[0])
        .x_label_area_size(40)
        .y_label_area_size(40)
        .build_cartesian_2d(1..5, 3..9)?;
    let mut y_hist_ctx = ChartBuilder::on(&areas[3])
        .x_label_area_size(40)
        .y_label_area_size(40)
        .build_cartesian_2d(1..5, 3..9)?;
    let mut scatter_ctx = ChartBuilder::on(&areas[2])
        .x_label_area_size(40)
        .y_label_area_size(40)
        .build_cartesian_2d(1f64..5f64, 3f64..9f64)?;
    scatter_ctx.configure_mesh()
        .disable_x_mesh()
        .disable_y_mesh()
        .draw()?;
    scatter_ctx.draw_series(sepal_samples.iter().map(|(x,y)| Circle::new((*x,*y), 3, BLUE.filled())))?;

    let x_hist = Histogram::vertical(&x_hist_ctx)
        .style(RED.filled())
        .margin(0)
        .data(sepal_samples.iter().map(|(x,_)| (*x as i32, 1)));
    x_hist_ctx.draw_series(x_hist)?;

    let y_hist = Histogram::horizontal(&y_hist_ctx)
        .style(GREEN.filled())
        .margin(0)
        .data(sepal_samples.iter().map(|(_,y)| (*y as i32, 1)));
    y_hist_ctx.draw_series(y_hist)?;

    Ok(())
}).style("width:60%")
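One possible reason the histograms look wrong is that `as i32` truncates every value, so everything from 3.0 up to (but not including) 4.0 lands in a single bar. A workaround, sketched here with only the standard library, is to scale each value to an integer bin index before counting, so bars can be narrower than one unit. The names `bin_index` and `bin_counts`, the 0.5 bin width, and the sample values are illustrative assumptions, not part of Plotters or the notebook.

```rust
use std::collections::BTreeMap;

/// Map a float to an integer bin index for a given bin width
/// (hypothetical helper, not part of Plotters).
fn bin_index(value: f64, bin_width: f64) -> i32 {
    (value / bin_width).floor() as i32
}

/// Count how many values fall into each bin, keyed by bin index.
fn bin_counts(values: &[f64], bin_width: f64) -> BTreeMap<i32, u32> {
    let mut counts = BTreeMap::new();
    for &v in values {
        *counts.entry(bin_index(v, bin_width)).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // Made-up sample values, not read from the iris dataset.
    let widths = [3.5, 3.0, 3.2, 3.1, 2.9, 3.6];
    let bin_width = 0.5;

    for (bin, count) in &bin_counts(&widths, bin_width) {
        // The left edge of a bin is its index times the bin width.
        let left = *bin as f64 * bin_width;
        println!("[{:.1}, {:.1}): {}", left, left + bin_width, count);
    }
}
```

Under this approach, the histogram series could be fed `(bin_index(*x, 0.5), 1)` pairs instead of `(*x as i32, 1)`, with the integer axis range scaled to match. The count axis range would also need to cover the largest bin count. Whether that matches the intended figure is still an open question.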

General feedback: personally, I like to have some output when a cell finishes running. This makes it easy to tell when a long (or short) running task is complete. For example:

:dep plotters = { version = "^0.3.0", default_features = false, features = ["evcxr", "all_series", "all_elements"] }
:dep polars = {version = "0.28.0", features = ["describe", "lazy", "ndarray"]}
:dep color-eyre = {version = "0.6.2"}
:dep ndarray = {version = "0.15.6"}
:dep smartcore = {version = "0.3.1"}
// or
// :dep polars = { git = "https://github.com/pola-rs/polars" }
// :dep color-eyre = { git = "https://github.com/yaahc/color-eyre" }
// :dep ndarray = { git = "https://github.com/rust-ndarray/ndarray" }
println!("Dependencies installed."); // visual feedback

OR

let numeric_iris_df: DataFrame = iris_df.drop("Species")?;
numeric_iris_df // visual feedback

wiseaidev commented 1 year ago

Hey @sgeos,

Thanks for bringing up this issue. We truly appreciate it! If you have any ideas for improvement, no matter how big or small, don't hesitate to submit a pull request. We're all about collaboration, and every contribution counts!

By the way, if you want to chat further about this repository, we have a Gitter channel you can join. Feel free to jump in and let's have some clever discussions! Looking forward to hearing more from you.

Best regards, Mahmoud

P.S. Your suggested changes look good to me! So, don't hold back – go ahead and submit that pull request! We're eagerly waiting to embrace your contributions.

wiseaidev commented 1 year ago

Hello there, @sgeos! Would you be up for sending in a PR? Your contribution would be greatly appreciated!

sgeos commented 1 year ago

Sorry, I moved on to other things, but there should be enough here for an interested party to restart the notebook and work through it, making the above changes.

wiseaidev commented 1 year ago

Thanks, @sgeos, once again, for improving this project!