Statistical Inference for One Population Standard Deviation: Hypothesis Testing

ashokkrish commented 1 month ago

@bryce-carson When the user selects

and

and enters the values

and selects

Have a numericInput() with bolded header

Hypothesized Population Standard Deviation (sigma_0) Value

with a default value of 0.15

followed by the existing

Worked out solutions for several exercises are given here (note: in some cases the hypothesis test is carried out about the population variance (sigma^2),
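The requested input could be sketched roughly as follows. This is only a minimal sketch: the input ID `hypStdDeviation` is borrowed from the code quoted later in this thread, while the `step` value and the MathJax delimiters in the label are assumptions.

```r
library(shiny)

# Sketch of the requested field. The ID matches input$hypStdDeviation as used
# later in this thread; step and the MathJax label markup are assumptions.
sigmaNaughtInput <- numericInput(
  inputId = "hypStdDeviation",
  # strong() renders the header in bold.
  label   = strong("Hypothesized Population Standard Deviation (\\( \\sigma_{0} \\)) Value"),
  value   = 0.15,  # default value requested above
  min     = 0,     # browser-side hint only; server-side validation is still needed
  step    = 0.01
)
```

The `min` argument only constrains the spinner in the browser, so a typed value can still be zero or negative.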

ashokkrish commented 1 month ago

bryce-carson commented 1 month ago

I think we should include a little extra information in the drop-down box regarding the alternative hypothesis to guide students in selecting the correct alternative hypothesis.

Students must understand the implied alternative hypothesis when reading word problems. For example,

If the machine must have $\sigma \le 0.15\ cm$, then $H_0: \sigma \le 0.15\ cm$ and $H_a: \sigma \gt 0.15\ cm$

We could have small parenthesized phrases in the drop-down box to signal which alternative hypothesis to select.

https://resources.nu.edu/statsresources/hypothesis
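One way the parenthesized hints might look is a named vector of choices. The values 1, 2, and 3 follow the `input$altHypothesis` codes used in the code quoted later in this thread; the label wording is purely hypothetical.

```r
# Hypothetical labels for the alternative hypothesis drop-down.
# Names are what the student sees; values are the codes the server reads.
altHypothesisChoices <- c(
  "sigma < sigma_0 (less than, below, reduced)"         = 1,
  "sigma != sigma_0 (different from, changed, not equal)" = 2,
  "sigma > sigma_0 (greater than, exceeds, increased)"   = 3
)

# In a Shiny UI this vector could be passed along the lines of:
#   selectInput("altHypothesis", "Alternative Hypothesis",
#               choices = altHypothesisChoices, selected = 2)
```

Keeping the codes separate from the labels means the hint text can be reworded without touching the server logic.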

bryce-carson commented 1 month ago

Personally, I need to review which is which. I know we cannot prove a hypothesis. We can only reject the null hypothesis or fail to reject it, and rejecting it is what leads us to accept the alternative hypothesis as likely or reasonable. The alternative hypothesis we would like to be true is not necessarily the correct one to select.

ashokkrish commented 2 weeks ago

@bryce-carson The three numericInput fields must be validated.

Sample Size (n), Sample Standard Deviation (s) and Hypothesized Population Standard Deviation (sigma_0) Value must be strictly greater than zero.

Sample Size (n) can take integer values only.
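The rules above could be checked with a small helper like the one below. It is a plain R sketch with a hypothetical function name; inside the app the same conditions would more likely be expressed through Shiny's own validation helpers.

```r
# Returns a character vector of problems; an empty vector means the inputs pass.
# check_sd_inputs is a hypothetical helper name, not something from the app.
check_sd_inputs <- function(n, s, sigma0) {
  is_positive <- function(x) {
    is.numeric(x) && length(x) == 1 && !is.na(x) && x > 0
  }
  problems <- character(0)

  if (!is_positive(n)) {
    problems <- c(problems, "Sample Size (n) must be greater than zero.")
  } else if (n != round(n)) {
    problems <- c(problems, "Sample Size (n) must be an integer.")
  }
  if (!is_positive(s)) {
    problems <- c(problems, "Sample Standard Deviation (s) must be greater than zero.")
  }
  if (!is_positive(sigma0)) {
    problems <- c(problems, "Hypothesized Population Standard Deviation (sigma_0) must be greater than zero.")
  }
  problems
}
```

For example, `check_sd_inputs(12.5, 0.2, 0.15)` reports only the integer problem. Note that a sample of size 1 passes these rules but leaves zero degrees of freedom, so a stricter lower bound on n may be worth considering.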

bryce-carson commented 2 weeks ago

> @bryce-carson The three numericInput fields must be validated.
>
> Sample Size (n), Sample Standard Deviation (s) and Hypothesized Population Standard Deviation (sigma_0) Value must be strictly greater than zero.
>
> Sample Size (n) can take integer values only.

Roger that.

bryce-carson commented 1 week ago

> @bryce-carson The three numericInput fields must be validated.
>
> Sample Size (n), Sample Standard Deviation (s) and Hypothesized Population Standard Deviation (sigma_0) Value must be strictly greater than zero.
>
> Sample Size (n) can take integer values only.

Only $n$ and $s$ are user inputs for a statistical inference of the population standard deviation. Regardless, I can check the value of the hypothesized $\sigma_0$.

bryce-carson commented 1 week ago

For some reason the user interface is not displaying for this now. I need to see why that is.

bryce-carson commented 1 week ago

A confidence interval or hypothesis test inference for one sample is not displaying for any parameter of interest. Only the title representing the selected inference type is displaying.

It might be an issue with the namespaces, but I doubt it because I haven't changed that and I wasn't having issues with display before.

ashokkrish commented 1 week ago

@bryce-carson

> value of the hypothesized $\sigma_0$.

This is shown only when the user selects

bryce-carson commented 1 week ago

> @bryce-carson
>
> > value of the hypothesized $\sigma_0$.
>
> This is shown only when the user selects

Ah, makes sense! Thanks.

bryce-carson commented 1 week ago

@ashokkrish, I've made decent progress on this issue's main panel display. I'm working on getting the validation for the new numericInput to work, and the conditionalPanels to work for the various alternative hypotheses.

Can you take a look at the screenshot below and let me know if you'd like any display changes in the three alternate displays of the formulas? (Don't forget, not all three would be displayed; they're only displayed together now during early development of this new feature). The feature could be completed tonight if we don't fight too much with formatting, but I know that's important.

[Annotated screenshot, 2024-11-20 19-52-34]

I should have written "like the below annotations, these annotations are respective to the selected alternative hypothesis." "Likewise [the referenced thing]" doesn't make sense.

ashokkrish commented 1 week ago

@bryce-carson

For the Test Statistic (TS) value calculation you don't need to include the subscripts at all

The TS formula is the same irrespective of whether the test is left-, right- or two-sided.

See one sample t-test for example.

bryce-carson commented 1 week ago

The flowchart is not applicable to this test then, @ashokkrish?

It appeared to me, at least at first, to show different test statistic values for left-tailed, two-tailed, and right-tailed tests. After reviewing the flowchart I see I was wrong: what it shows is different rejection criteria. The rejection criteria are derived from the critical value obtained with `critVal <- round(qchisq(1 - sigLvl, df = data$Results$parameter), cvDigits)`, yes?

bryce-carson commented 1 week ago

@ashokkrish, you can ignore the last comment. I realized my mistake and I understand now what you mean, fully, I think.

bryce-carson commented 1 week ago

@ashokkrish, I have added a commit which implements almost all of the formatting that you want. I worked at getting the number of line breaks identical to the number in the population mean. Let me know if there are any mistakes with the formula; for now, disregard the values, they are placeholders (see the third task list item).

- [x] Input validation,
- [x] Interpretations for alternative hypotheses, and
- [x] Substitute values in test statistic calculation for input values.

What is completed is the conditional display, the non-conditional display, and the LaTeX.

bryce-carson commented 6 days ago

@ashokkrish, I am working on the interpretations for alternative hypotheses, learning to use the functions that already exist in the statInfr file to ensure the formatting is the same and that a plot of the rejection region is created in the same manner. I should be done by tomorrow afternoon when we meet, I hope (if I get up early enough because I slept well).

ashokkrish commented 6 days ago

H0: Ha:

alpha

Test Statistic

Given

Using P-Value Method:

Using Critical Value Method:

Conclusion:

bryce-carson commented 1 day ago

@ashokkrish, the P-value approach is a little confusing. Which values should I use? I did most of the UI work and set the values correctly for the most part, but I know some of them are the wrong values.

Just committed. Please let me know what to change and I'll get that done tomorrow, which will be a long shift for me so I should have everything done by the end of the day (finally!).

I read this, but I didn't understand it much because its formatting is bad. https://online.stat.psu.edu/statprogram/reviews/statistical-concepts/hypothesis-testing/p-value-approach

ashokkrish commented 1 day ago

You shorten (by commenting relevant lines) the following

to this

ashokkrish commented 1 day ago

> @ashokkrish, the P-value approach is a little confusing. Which values should I use? I did most of the UI work and set the values correctly for the most part, but I know some of them are the wrong values.
>
> Just committed. Please let me know what to change and I'll get that done tomorrow, which will be a long shift for me so I should have everything done by the end of the day (finally!).
>
> I read this, but I didn't understand it much because its formatting is bad. https://online.stat.psu.edu/statprogram/reviews/statistical-concepts/hypothesis-testing/p-value-approach

@bryce-carson

To calculate the P-value for a Chi-Square test, use the following R function calls:

* `pchisq(TS, df, lower.tail = TRUE)` for a left-sided test
* `2*pchisq(TS, df, lower.tail = TRUE)` for a two-sided test and
* `pchisq(TS, df, lower.tail = FALSE)` for a right-sided test
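The three cases above can be wrapped in one helper, sketched below under a hypothetical name. One deliberate difference, flagged as an assumption: for the two-sided case the sketch doubles the smaller of the two tail areas, a common convention that agrees with `2*pchisq(TS, df, lower.tail = TRUE)` whenever TS falls in the lower half of the distribution and keeps the result from exceeding 1 otherwise.

```r
# chisq_p_value is a hypothetical helper name.
chisq_p_value <- function(ts, df,
                          alternative = c("less", "two.sided", "greater")) {
  alternative <- match.arg(alternative)
  lower <- pchisq(ts, df, lower.tail = TRUE)   # area to the left of TS
  upper <- pchisq(ts, df, lower.tail = FALSE)  # area to the right of TS
  switch(alternative,
    less      = lower,
    greater   = upper,
    # Assumption: double the smaller tail and cap at 1.
    two.sided = min(1, 2 * min(lower, upper))
  )
}
```

A call such as `chisq_p_value(5, 11, "two.sided")` gives the same value as `2*pchisq(5, 11)`, because a test statistic of 5 sits in the lower tail for 11 degrees of freedom.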

ashokkrish commented 1 day ago

@bryce-carson

If you look at

You will see the full code to plot a Chi-Square distribution, identify AR and RR, locate the TS in red colour. See below for example

You can use the same code base here.
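For readers without access to the referenced code, a plot of this kind might be sketched in base R as follows. This is not the existing CougarStats plotting code: the function name, the colours, and the restriction to a right-sided test are all assumptions made for illustration.

```r
# Hypothetical base-R sketch: chi-square density, shaded rejection region (RR),
# and the test statistic (TS) marked in red. Right-sided test only.
plot_chisq_test <- function(ts, df, alpha) {
  cv    <- qchisq(1 - alpha, df)                 # right-sided critical value
  x_max <- max(qchisq(0.999, df), ts) * 1.05     # make sure TS is visible
  x     <- seq(x_max / 500, x_max, length.out = 500)
  y     <- dchisq(x, df)

  plot(x, y, type = "l", xlab = "Chi-square value", ylab = "Density",
       main = paste("Chi-square distribution, df =", df))

  # Shade the rejection region to the right of the critical value.
  in_rr <- x >= cv
  polygon(c(cv, x[in_rr], x_max), c(0, y[in_rr], 0),
          col = "grey80", border = NA)

  abline(v = ts, col = "red", lwd = 2)           # locate the test statistic
  invisible(cv)
}
```

Calling `plot_chisq_test(ts = 19.56, df = 11, alpha = 0.05)` draws the curve and returns the critical value invisibly; the unshaded area is the acceptance region (AR).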

bryce-carson commented 22 hours ago

> > @ashokkrish, the P-value approach is a little confusing. Which values should I use? I did most of the UI work and set the values correctly for the most part, but I know some of them are the wrong values.
> >
> > Just committed. Please let me know what to change and I'll get that done tomorrow, which will be a long shift for me so I should have everything done by the end of the day (finally!).
> >
> > I read this, but I didn't understand it much because its formatting is bad. https://online.stat.psu.edu/statprogram/reviews/statistical-concepts/hypothesis-testing/p-value-approach
>
> @bryce-carson
>
> To calculate the P-value for a Chi-Square test use the following R function
>
> * `pchisq(TS, df, lower.tail = TRUE)` for a left-sided test
> * `2*pchisq(TS, df, lower.tail = TRUE)` for a two-sided test and
> * `pchisq(TS, df, lower.tail = FALSE)` for a right-sided test

For the p-value method, which are the correct pValueMethodRelationalOperatorStrings?

      ## Establish the strings to use in MathJax-supported LaTeX for the hypotheses and relations.
      if (input$altHypothesis == 1) {
        nullHypString <- "\\geq";
        altHypString <- "\\lt";
        pValueMethodRelationalOperatorString <- "\\lt";
      } else if (input$altHypothesis == 2) {
        nullHypString <- "=";
        altHypString <- "\\ne";
        pValueMethodRelationalOperatorString <- "\\gt";
        chiSqPValue <- 2*chiSqPValue;
      } else {
        nullHypString <- "\\leq";
        altHypString <- "\\gt";
        pValueMethodRelationalOperatorString <- "\\gt";
      }
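On the question of the operator strings: under the usual P-value method the decision rule is the same for all three alternatives, namely reject the null hypothesis when the P-value is less than or equal to the significance level. The sketch below, with a hypothetical function name, therefore picks the operator from the comparison itself rather than from the selected alternative.

```r
# p_value_decision is a hypothetical helper name.
# The displayed operator depends on how the P-value compares with alpha,
# not on which alternative hypothesis was selected.
p_value_decision <- function(p_value, alpha) {
  if (p_value <= alpha) {
    list(operator = "\\leq", reject = TRUE)   # P <= alpha: reject H0
  } else {
    list(operator = "\\gt", reject = FALSE)   # P > alpha: do not reject H0
  }
}
```

With this approach the `if`/`else` branches on `input$altHypothesis` would only need to set the hypothesis strings and the P-value calculation.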

bryce-carson commented 22 hours ago

@ashokkrish, please see the last comment.

bryce-carson commented 16 hours ago

@ashokkrish, should the test statistic value be calculated using the commented code or with qchisq? I know the critical value and the p-value are calculated using qchisq and pchisq, respectively.

      ## chiSqTestStatistic <- sqrt((degreesOfFreedom * input$SSDStdDev^2) /  input$hypStdDeviation^2);
      chiSqTestStatistic <- qchisq(SigLvl(), 11, lower.tail = leftTailed);

I was calculating the test statistic value with the commented code previously, but you said the value produced was incorrect.

ashokkrish commented 16 hours ago

@bryce-carson

The test statistic should be calculated using the formula

To calculate the critical value (CV) for a Chi-Square test, use the following R function calls (by default, `lower.tail = TRUE`):

* `qchisq(alpha, df)` for a left-sided test
* `qchisq(alpha/2, df)` and `qchisq(1 - alpha/2, df)` for a two-sided test and
* `qchisq(1 - alpha, df)` for a right-sided test

To calculate the P-value for a Chi-Square test, use the following R function calls:

* `pchisq(TS, df, lower.tail = TRUE)` for a left-sided test
* `2*pchisq(TS, df, lower.tail = TRUE)` for a two-sided test and
* `pchisq(TS, df, lower.tail = FALSE)` for a right-sided test
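Putting the critical value method together might look like the sketch below. It assumes the test statistic is the usual chi-square statistic, (n - 1) * s^2 / sigma_0^2 with n - 1 degrees of freedom; the function name and the sample numbers in the usage note are hypothetical.

```r
# chisq_sd_test is a hypothetical helper name.
chisq_sd_test <- function(n, s, sigma0, alpha,
                          alternative = c("less", "two.sided", "greater")) {
  alternative <- match.arg(alternative)
  df <- n - 1
  ts <- df * s^2 / sigma0^2          # assumed test statistic (no square root)

  # Critical value(s), following the qchisq calls listed above.
  cv <- switch(alternative,
    less      = qchisq(alpha, df),
    two.sided = c(qchisq(alpha / 2, df), qchisq(1 - alpha / 2, df)),
    greater   = qchisq(1 - alpha, df)
  )

  # Reject H0 when the test statistic falls in the rejection region.
  reject <- switch(alternative,
    less      = ts <= cv,
    two.sided = ts <= cv[1] || ts >= cv[2],
    greater   = ts >= cv
  )

  list(statistic = ts, df = df, critical = cv, reject = reject)
}
```

As a usage example, `chisq_sd_test(n = 12, s = 0.2, sigma0 = 0.15, alpha = 0.05, alternative = "greater")` computes a right-sided test with 11 degrees of freedom and returns the statistic, the critical value, and the decision in one list.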

ashokkrish commented 16 hours ago

@bryce-carson

You have `chiSqTestStatistic <- sqrt((degreesOfFreedom * input$SSDStdDev^2) / input$hypStdDeviation^2);`

Drop the square root completely.
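A short note on why the square root is likely unnecessary, assuming a random sample from a normal population: the quantity with a known chi-square distribution is the scaled ratio of variances, so that ratio is what gets compared with values from `qchisq` and `pchisq`.

```latex
\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}
\quad\Longrightarrow\quad
\chi^2_{\mathrm{TS}} = \frac{(n-1)s^2}{\sigma_0^2}
\quad \text{under } H_0 : \sigma = \sigma_0 .
```

Because standard deviations are positive, a hypothesis about $\sigma$ is equivalent to the same hypothesis about $\sigma^2$, so the test is still a test of the standard deviation even though the statistic is built from variances. The square root of this ratio does not follow a chi-square distribution with $n - 1$ degrees of freedom, so it cannot be compared against the chi-square critical values.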

bryce-carson commented 2 hours ago

> You shorten (by commenting relevant lines) the following
>
> to this

@ashokkrish, aren't we calculating the variance, not the standard deviation, if the square root of the ratio is not calculated? Why did we not notice this earlier? See the quoted comment with the formula for the test statistic, which includes a square root. Why switch? Are we on the same page? I assigned `chiSqTestStatistic <- sqrt((degreesOfFreedom * input$SSDStdDev^2) / input$hypStdDeviation^2);` immediately after the formula describing this equation.

> @bryce-carson
>
> You have `chiSqTestStatistic <- sqrt((degreesOfFreedom * input$SSDStdDev^2) / input$hypStdDeviation^2);`
>
> Drop the square root completely.

Let's ensure we're on the same page before I adjust this. We should talk about this at 3PM during our meeting, resolve it together step by step, and then close the issue.

ashokkrish / CougarStats

Statistical Inference for One Population Standard Deviation: Hypothesis Testing #33