bootstrapworld / curriculum

Add dot plots to pyret library #2205

Open flannery-denny opened 1 month ago

flannery-denny commented 1 month ago

Here's what they look like / how we teach about them in CODAP: https://www.bootstrapworld.org/materials/spring2024/en-us/lessons/codap-dot-plots-bar-charts-codap/index.shtml?pathway=false

Added spec from Emmanuel:

We should add a dot-plot-series type, with fields for labels and color. If the data is categorical, the labels should be processed in the order they are given. If the data is quantitative, the min and max values should define a standard x-axis. For each location on the axis (or each label), the number of matching values should be represented as solid circles of whatever color was specified. Resizing the chart window should stretch the axes, but not deform the circles into ellipses.
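The counting step in the spec above could be sketched roughly as follows. This is Python rather than Pyret, and every name in it (`categorical_dot_counts`, `quantitative_dot_counts`, `dot_centers`, the `radius` parameter) is hypothetical, chosen only for illustration; it is not the code.pyret.org implementation.

```python
from collections import Counter

def categorical_dot_counts(labels):
    """Count occurrences of each label, keeping the order in which labels are given."""
    counts = Counter(labels)
    # dict.fromkeys drops duplicates while preserving first-appearance order
    return [(label, counts[label]) for label in dict.fromkeys(labels)]

def quantitative_dot_counts(values):
    """Return the axis range (min, max) and the number of dots at each value."""
    counts = Counter(values)
    return min(values), max(values), dict(counts)

def dot_centers(x, count, radius=1.0):
    """Stack `count` circles of fixed radius above axis location x.

    The radius is fixed in drawing units (an assumption), so stretching
    the axes would move the centers without deforming the circles.
    """
    return [(x, radius * (2 * i + 1)) for i in range(count)]
```

For example, `categorical_dot_counts(["cat", "dog", "cat"])` gives two dots for "cat" followed by one for "dog", and `dot_centers(3, 2)` stacks two circles at x = 3.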

ds26gte commented 2 weeks ago

I think this is a code.pyret.org issue since that's where this code would go

ds26gte commented 2 weeks ago

I've made a PR for this in

https://github.com/brownplt/code.pyret.org/pull/554

The corresponding CPO issue is https://github.com/brownplt/code.pyret.org/issues/469

Please follow developments there.

ds26gte commented 1 week ago

Since we want to generate dot plots for numerical x (in addition to categorical), and since the input types differ (list of numbers vs. list of strings), do we want two separate dot-chart functions?

flannery-denny commented 1 week ago

Great question @ds26gte! My first thought was couldn't we hide that under the hood... but, since we provide students with the contract for the function, and the domains are different, we probably need to reveal that there are two separate functions?

@schanzer @retabak What do you think?

ds26gte commented 5 days ago
retabak commented 5 days ago

Okay, I've done some more research, and I think we should only have numerical dot plots. While obviously one could make a categorical dot plot, I don't think there's any real reason for us to do so. The dot plot is intended as a stepping stone to histograms, and is useful because we can use it to think about center and spread. Also, Illustrative Math says that it is a numerical display. So, that solves one problem!

Regarding your questions, @ds26gte ....

I have a feeling there's something I don't understand about how input data is provided for histograms. In Pyret, the user just provides a column name (but perhaps this is because of @schanzer's teachpack?). That's what I was hoping would happen for the dot plots, too. Suspecting it would be best for @schanzer to weigh in here.

I'm confused about the bin size question, because there are no bin sizes? Here are some dot plots that were made in CODAP, all from the Animals dataset. (I do hope our dot plots will look somewhat similar.)

[Images: three CODAP dot plots from the Animals dataset]

schanzer commented 5 days ago

@ds26gte When the DS library passes data to from-list.histogram, it only checks to make sure the data is numeric. But there's nothing to prevent me from treating dot-plots differently! Let me know what you need, and I'll make sure you get it.

ds26gte commented 5 days ago

What we call "bar chart" Google actually calls "column chart". So yes, we can easily get what Google calls a "bar chart", i.e., one where the x-axis is vertical.

ds26gte commented 5 days ago

> Regarding your questions, @ds26gte ....
>
> I have a feeling there's something I don't understand about how input data is provided for histograms. In Pyret, the user just provides a column name (but perhaps this is because of @schanzer's teachpack?). That's what I was hoping would happen for the dot plots, too. Suspecting it would be best for @schanzer to weigh in here.
>
> I'm confused about the bin size question, because there are no bin sizes? Here are some dot plots that were made in CODAP, all from the Animals dataset. (I do hope our dot plots will look somewhat similar.)

Pyret's histogram takes a list of numbers, in no particular order and generally with repetitions. It then charts from the lowest to the highest number, with the y length at each x proportional to the number of occurrences of that x. Furthermore, it bins the x's into some convenient size, so each bin's y length is proportional to the total number of occurrences of all the x's within that bin.
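To make the binning step concrete, here is a minimal sketch in Python (not Pyret). How Pyret actually picks its "convenient" bin size isn't stated in this thread, so `bin_width` is simply a parameter here, and the function name `bin_counts` is hypothetical.

```python
import math
from collections import Counter

def bin_counts(values, bin_width):
    """Group values into equal-width bins starting at the minimum value.

    Returns a list of (bin_start, count) pairs from the lowest bin to the
    highest, including empty bins, so gaps in the data stay visible.
    """
    lo = min(values)
    # Index of the bin each value falls into, counted from the minimum
    counts = Counter(math.floor((v - lo) / bin_width) for v in values)
    n_bins = max(counts) + 1
    return [(lo + i * bin_width, counts.get(i, 0)) for i in range(n_bins)]

print(bin_counts([1, 2, 2, 3, 7], 2))
# [(1, 3), (3, 1), (5, 0), (7, 1)]
```

With whole-number data and `bin_width=1`, each bin holds exactly one x value, so the counts are the same ones a dot plot would show as stacked dots.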

ds26gte commented 2 days ago

I've added num-dot-chart to the PR https://github.com/brownplt/code.pyret.org/pull/554.

FYI: orthogonal to this, I've noticed that adding the method .horizontal(true) doesn't quite work -- it replaces the bar with an ugly single ellipse rather than replacing it with an appropriate number of small circles. I will need to change my SVG munging to adapt correctly to horizontal bar charts. 😦

schanzer commented 2 days ago

@retabak here's a demo of the dot plot implementation

FYI - the colors are customizable, but I'm not sure if the dots are. @ds26gte can weigh in on it. Please leave your feedback here!

retabak commented 2 days ago

Hi, @ds26gte ! This is getting closer! The dots look beautiful :)

I'm observing some other issues, however.

The motivation for teaching about dot plots is to use them as a stepping stone to box plots and histograms. This means that students need to be able to see gaps on dot plots, and we need a consistent interval size on the x-axis.

Right now, the dot plot and histogram for weeks do not have the same shape (see below). They need to!

[Images: the dot plot and the histogram for weeks]

Then, one more image: There is definitely something weird happening in this dot plot for pounds. The y-axis needs to use whole numbers, as it shows count. (The x-axis also doesn't let students see the shape.)

@schanzer - if you have anything to add or if I'm explaining poorly, please chime in!
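The two axis requirements described above (a consistent x-axis interval that keeps gaps visible, and a whole-number y-axis for counts) could be sketched like this. It is Python rather than Pyret, and the function name `dot_plot_axes` and the `step` parameter are hypothetical, not part of the PR.

```python
from collections import Counter

def dot_plot_axes(values, step=1):
    """Compute tick positions for a dot plot.

    x ticks run from the minimum to the maximum value at a fixed interval,
    including positions with no data, so gaps remain visible.
    y ticks are whole numbers from 0 up to the largest count.
    """
    counts = Counter(values)
    lo, hi = min(values), max(values)
    n_steps = int((hi - lo) // step)
    x_ticks = [lo + i * step for i in range(n_steps + 1)]
    y_ticks = list(range(0, max(counts.values()) + 1))
    return x_ticks, y_ticks
```

For the values 1, 2, 2, 5, this yields x ticks at 1 through 5 (with 3 and 4 left empty) and y ticks at 0, 1, 2.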