brownplt / pyret-lang

The Pyret language.

More Stats Functions #1732

Open ds26gte opened 2 months ago

ds26gte commented 2 months ago

Issue brownplt/code.pyret.org#520 filed by @schanzer

We've had a few teachers ask if Pyret supports various stats functions:

Getting these implemented as a Pyret program would be great, but implementing them as part of Pyret's stats library would be much better.

(In keeping with the other stats functions, these should all operate on lists. I'll wrap them to work with tables in the DS teachpack.)

shriram commented 2 months ago

Thanks, @ds26gte! Can you add some tests, please?

schanzer commented 2 months ago

@ds26gte awesome to see this progress! I'm still hoping we can add a z-test function as well (see checklist in the issue).

ds26gte commented 2 months ago

@team, the z-test seems to require, in addition to the two samples, also the population (rather than the sample) variances. Please add what you think are the right arguments for the z-test and the other functions that I've already added.

schanzer commented 2 months ago

@ds26gte waiting to hear back about the desired contract from one of the teachers who requested these functions, which should give me a sense for whether these are close enough to what they need that I couldn't bridge the gap in a teachpack. Will wait to hear back.

schanzer commented 2 months ago

@ds26gte I spoke with Nancy Pfenning today, who gave the following descriptions of what the inputs to various functions should be:

z-test: list of numbers, stddev, hypothesized mean

t-test: list of numbers, mean

2-sample t-test: 2 lists of numbers (can be different sizes), "tail-ness" (boolean operator? >, <, ≠?)

paired t-test: 2 lists of numbers (error if different sizes, order matters), "tail-ness" (boolean operator? >, <, ≠?)

pooled t-test: 2 lists of numbers, "tail-ness" (boolean operator? >, <, ≠?)

chi-squared: 2 lists of numbers (assumes pre-summarized data)

I think this is all in line with what you have, with the exception of the z-test. Can you double-check your implementation, and let me know why it has two lists of numbers?
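
To make the one-list contracts above concrete, here is a minimal sketch in Python (not Pyret, and not the code in this PR). The function names are hypothetical, and each function returns only the score, not a test decision. The paired case is treated as a one-sample score on the pairwise differences, which is the usual textbook reduction.

```python
import math
import statistics


def z_score(sample, pop_stddev, hypothesized_mean):
    """z score of a sample mean, given a known population stddev."""
    standard_error = pop_stddev / math.sqrt(len(sample))
    return (statistics.mean(sample) - hypothesized_mean) / standard_error


def t_score_one_sample(sample, hypothesized_mean):
    """t score of a sample mean, using the sample stddev (n - 1 divisor)."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - hypothesized_mean) / standard_error


def t_score_paired(first, second):
    """Paired t score: a one-sample t score on the pairwise differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same size")
    differences = [a - b for a, b in zip(first, second)]
    return t_score_one_sample(differences, 0)
```

Note that the z score takes a single list plus the population stddev, which matches the contract described above rather than a two-list version.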

schanzer commented 1 month ago

@ds26gte Sorry for the delay on this! I was hoping to hear back from the teacher who was requesting them, but they're overwhelmed with end-of-year stuff so I hopped on Zoom with Joy instead. :)

Below are the contract and purpose statements for the various functions that Bootstrap would export:

sample-variance :: Table, Column -> Number

pop-variance :: Table, Column -> Number

t-test-1-sample :: Table, String, Number -> Number

t-test-2-sample :: Table, String, String -> Number # this is the same as t-test-independent, so as long as one is implemented we're fine

t-test-paired :: Table, String, String -> Number 

t-test-pooled :: Table, Column1, Column2 -> Number

chi-sqr :: Table -> p-value # consumes a 2-way Table of observed counts

chi-sqr-gof :: Table, Table -> p-value # consumes a 1-col Table of observed counts, and a 1-col Table of expected counts

You'll want to replace Table in most of the contracts above with List, but for chi-sqr I'm assuming you want a list of lists? I'll wrap the functions in our library to keep everything in Table-land.
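
As a rough illustration of the list-of-lists idea for chi-sqr, here is a sketch in Python with hypothetical function names. It is an assumption about the shape of the data, not the library's API, and it computes only the chi-squared statistic. Turning that into a p-value would additionally need the chi-squared distribution function.

```python
def chi_sqr_statistic(observed):
    """Chi-squared statistic for a 2-way table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


def chi_sqr_gof_statistic(observed, expected):
    """Goodness-of-fit statistic for two equal-length lists of counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same size")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

For example, `chi_sqr_statistic([[10, 20], [20, 10]])` uses an expected count of 15 in every cell.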

ds26gte commented 4 weeks ago

(BTW, our naming needs to move away from contrasting linear against multiple. They are both linear -- it's actually single vs multiple.)

schanzer commented 4 weeks ago

@ds26gte good call. I propose linear-regression and multiple-linear-regression, possibly also with single-linear-regression as an alias for the first.

ds26gte commented 3 weeks ago

Looks like at least the googleable literature also contrasts linear against multiple. To be sure, multiple-regression describes an n-dimensional plane, which is not, in a geometric sense, linear. On the other hand, even in the single-dimensional case, we can contrast linear against quadratic and other higher powers, which we don't use.

Essentially, our code and curriculum only deal with predictor functions that operate on one or multiple independent variables, but in both cases only take the first power of the independent variable(s). We want names that capture this and also don't mislead.

ds26gte commented 3 weeks ago

OK, apropos the various z-tests and t-tests, I don't think the things we're implementing are tests. Did we just want scores? If so, specifying the "tailness" as an argument makes no sense. The tailness is something you use along with the score in a subsequent (complicated) step for which we currently do not have code. This subsequent step could be automated, but it requires more coding.

The score gives us an abscissa to associate with our sample. The confidence level identifies one or two contiguous areas under the probability density function (normal, t, F, etc). The tailness is additional input that helps us identify this area. We then find the terminus abscissa associated with this area. Finally, we check if our own sample's abscissa is on the correct side of this terminus abscissa. So the test's result is a boolean.

As a coding task, what we need is the ability to find an abscissa given an area.

At a lower level, this means finding the root ("zero") of the difference between the integral of the function (with one integration bound varying) and a known area. This requires me to implement a suitable numerical integration function and a Newton-Raphson root-finding function. I can do both, but it is a big undertaking, so...

Do we want to do this?

Could you check with Nancy or against our curriculum goals? (The current texts don't mention anything, but maybe I'm not grepping expertly.)
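
The "abscissa given an area" step described above can be sketched as follows, in Python rather than Pyret and for the standard normal density only. This is an illustrative assumption, not the Lua prototype or anything in this PR: all function names are hypothetical, Simpson's rule stands in for "a suitable numerical integration function", and Newton-Raphson uses the density itself as the derivative of the area.

```python
import math


def normal_pdf(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def area_left_of(x):
    """Area under the density left of x; by symmetry, 0.5 plus the area on [0, x]."""
    return 0.5 + simpson(normal_pdf, 0.0, x)


def abscissa_for_area(area, tol=1e-10, max_iter=50):
    """Newton-Raphson on area_left_of(x) - area, starting at x = 0."""
    x = 0.0
    for _ in range(max_iter):
        diff = area_left_of(x) - area
        if abs(diff) < tol:
            return x
        x -= diff / normal_pdf(x)  # derivative of the area is the density
    raise ValueError("Newton-Raphson did not converge")


def rejects_null(score, confidence=0.95, tail="two-sided"):
    """Boolean test result: is the score beyond the terminus abscissa?"""
    if tail == "two-sided":
        return abs(score) > abscissa_for_area(1 - (1 - confidence) / 2)
    if tail == "greater":
        return score > abscissa_for_area(confidence)
    if tail == "less":
        return score < abscissa_for_area(1 - confidence)
    raise ValueError("tail must be 'two-sided', 'greater' or 'less'")
```

With these definitions, `abscissa_for_area(0.975)` comes out close to the familiar 1.96. The t, chi-squared and F cases would need their own densities (and Γ), which this sketch does not attempt.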

ds26gte commented 2 weeks ago

Latest changes to z-, t- and chi- functions in commit 207d18b34.

Using test in the function names as spec'd. However, please consider changing it to score or value, since these give an x-value for the related probability density function.

Note: if the original spec setter did mean test, i.e., a boolean output is desired, then we need to add libraries for numerical integration, Γ, Newton-Raphson, and various prob density functions, as outlined above. This can be done and if anyone wants to review my prototype in Lua, do lmk. (Γ is an improper integral, but the numerical-integration routine can be adapted for it.)

Important: there is a non-glaring typo on the Investopedia website in its formula for the pooled t-test. So I've checked all the t-test-* functions against a paper textbook.
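
For reference, the pooled two-sample t statistic as it usually appears in textbooks is shown below. This is the standard form, stated here as an aid to review, and is not copied from the commit. Here $\bar{x}_i$, $s_i^2$ and $n_i$ are the mean, sample variance and size of sample $i$.

$$
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
$$

The statistic has $n_1 + n_2 - 2$ degrees of freedom.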