memphis-iis / datawhys-content-notebooks-python

Content for DataWhys in the form of JupyterLab notebooks (.ipynb files)
Apache License 2.0

Notebook: Descriptive statistics #4

Closed aolney closed 4 years ago

aolney commented 4 years ago

See the spreadsheet for details

Content: DB
Programming: AO

Ideas/prereqs: Central tendency, variability, aggregating per group (dplyr summarize), sampling, estimation error/error bars
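
As a rough illustration of the ideas listed above (not taken from the notebook itself), here is a minimal Python sketch that computes central tendency, variability, and a standard error per group using only the standard library. The toy data and the `summarize_by_group` helper are hypothetical; the actual notebook may use a different library and dataset.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical toy data as (group, value) pairs; not from the notebook.
records = [
    ("a", 2.0), ("a", 4.0), ("a", 6.0),
    ("b", 1.0), ("b", 1.0), ("b", 4.0),
]

def summarize_by_group(rows):
    """Aggregate values per group, similar in spirit to a grouped summarize."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)

    summary = {}
    for group, values in groups.items():
        sd = statistics.stdev(values)  # sample standard deviation (n - 1)
        summary[group] = {
            "n": len(values),
            "mean": statistics.mean(values),      # central tendency
            "median": statistics.median(values),  # central tendency, robust to outliers
            "sd": sd,                             # variability
            "se": sd / math.sqrt(len(values)),    # standard error of the mean
        }
    return summary

summary = summarize_by_group(records)
for group, stats in sorted(summary.items()):
    print(group, stats)
```

The standard error is one common basis for error bars: it estimates how much a group mean would vary from sample to sample.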

Direct link https://jupyter.olney.ai/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fmemphis-iis%2Fdatawhys-content-notebooks&subPath=Descriptive-statistics.ipynb&app=lab

aolney commented 4 years ago

V1 done with https://github.com/memphis-iis/datawhys-content-notebooks/pull/31

aolney commented 4 years ago

@sdflem @andrewtawfik reopening to initiate any discussion before releasing this Sat

andrewtawfik commented 4 years ago

@aolney, thanks for getting this started. As I reviewed it, a couple of things came to mind that might help. I wonder if we could have some standard sections that would help those who are new. Some potential options:

  1. "What you will learn". This could essentially be some advance organizers to take advantage of the pretraining effect. It might be helpful to prime the learners about what follows: "In the section that follows, you will learn about descriptive statistics and how the following help us..."

This could help w/ the cognitive load once something like sampling comes up about halfway.

  2. Another thing that might be helpful is to include a section at the top with something like:

"When to use" or maybe "What this helps with"

and later

"What it doesn't do". This could be helpful to prime the learners that it doesn't make judgments about causality, statistically significant differences, etc. This could help with their schema formation and help when those topics do come later. Something like:

"While this is helpful to provide an overview of the data and provides some standardization, we need to be careful about how much we use it and the judgments we make because it does not..."

  3. We had talked about the mental models developing and how quickly that may have appeared. I don't know if it would be possible, but it might be really cool to have some kind of hover help and/or links.

For example, when they read the correlation notebook, there would be a line that says something like "This type of analysis requires the use of ratio data", with a hover-over giving a brief definition and possibly a link to the prior document. This could help from a flow theory perspective and maintain their train of thought, especially as things get more complicated.

ddbowman commented 4 years ago

I agree with Andrew T. about the hover help. I had thought about this earlier but we were so rushed to get everything ready. In the description, where I thought a hover link would be good, I put the ideas in italics.

I like everything Andrew T. suggested except: I don't agree with the "what it doesn't do" section. In my experience when you include this type of idea the students remember it but not necessarily as something they should NOT do. Instead of making this an official section, maybe we could just include the information without emphasizing it as a formal section. Just a thought.

aolney commented 4 years ago

I looked into the hover issue, and unfortunately this isn't something we can directly put into the notebooks without making them incompatible outside our system (1; 2).

@ddbowman would you like to take a crack at writing the sections you endorsed:

?

ddbowman commented 4 years ago

Sure, I will work on it this weekend.

ddbowman commented 4 years ago

Sorry, I forgot you are trying to open these on Saturday. I will work on it this afternoon.

aolney commented 4 years ago

Thanks :) If it helps, I will probably send out Sat afternoon.

ddbowman commented 4 years ago

I added the two sections. Can you take a look and see if it is too brief? :)

aolney commented 4 years ago

@ddbowman I'm not seeing your changes. If you used the Jupyter link at the top of the issue, did you commit and push the changes afterwards?

ddbowman commented 4 years ago

No I did not. I saved it only.

ddbowman commented 4 years ago

Hi Andrew, I am sorry but I don't know how to do that. Can you walk me through it? Here is what I added before the level of measure section. Best, Dale

What you will learn

In the sections that follow you will learn about descriptive statistics, in particular numerical EDA, and how they can help us learn about our data and what types of analyses may be appropriate. We will study the following:

When to use numerical EDA

Descriptive statistics, both numerical and graphical, are useful when you begin a data science project and you want to explore the data. Often insights will be gained that can be useful in further analyses.
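
To make "exploring the data" concrete, a first numerical pass might look something like the sketch below. It is only an illustration under assumptions: the values are made up, `five_number_summary` is a hypothetical helper, and the notebook itself may present this differently.

```python
import statistics

# Hypothetical measurements; not from the notebook.
values = [3, 7, 8, 5, 12, 14, 21, 13, 18]

def five_number_summary(data):
    """Return min, quartiles, and max as a quick first look at a variable."""
    # method="inclusive" treats the data as the full range of interest,
    # so the quartiles never fall outside the observed min and max.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": min(data),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(data),
    }

summary = five_number_summary(values)
iqr = summary["q3"] - summary["q1"]  # spread of the middle half of the data
print(summary, "IQR:", iqr)
```

A summary like this can flag skew or unusual values early, which in turn hints at which later analyses may be appropriate.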

aolney commented 4 years ago

Looks great :) I'll add and save to the repo.