Derek-Jones / ESEUR-code-data

Code and data used to create the examples in "Evidence-based Software Engineering based on the publicly available data"
http://www.knosof.co.uk/ESEUR/

p = .05 standard is 1.96 sigma, not >2 #10

Closed: Deleetdk closed this issue 4 years ago

Deleetdk commented 4 years ago

Page 256:

[Screenshot of the quoted passage from page 256]

But this is false: the standard normally used is a two-tailed test at .05, and that corresponds to 1.96 sigma, not >2.

```r
> pnorm(-1.96)
[1] 0.0249979
> pnorm(1.96, lower.tail = FALSE)
[1] 0.0249979
```

These tails sum to ~5%. I also don't think higher-impact journals generally have stricter standards; a review of studies on journal impact factor and journal scientific rigor finds generally no relationship: https://www.frontiersin.org/articles/10.3389/fnhum.2018.00037/full
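
As a quick sanity check, the critical value itself can be recovered with qnorm; the two-sided 5% cut-off comes out at 1.96, not something above 2:

```r
> 2 * pnorm(-1.96)    # both tails together, ~5%
[1] 0.04999579
> qnorm(0.975)        # two-sided 5% critical value
[1] 1.959964
```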

Derek-Jones commented 4 years ago

Change: greater -> less, and make it clear that this is a one-sided test.
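
For reference, a one-sided test at the .05 level needs only about 1.64 sigma, which is indeed less than 2 (and less than the 1.96 of the two-sided case):

```r
> qnorm(0.95)    # one-sided critical value at .05
[1] 1.644854
```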

I have seen some cognitive psychology journals wanting 0.01 (although a quick search does not locate any, or at least none that list p-value requirements).

The poor correlation between a journal's impact factor and the connection of the articles it publishes with reality is covered in the introduction chapter.

Deleetdk commented 4 years ago

Maybe you are thinking of this proposal paper to lower the threshold to .005. It has not been adopted by any journal as far as I know. https://www.nature.com/articles/s41562-017-0189-z already has 1000 citations, which must be great for these authors' careers.

The closest to adoption, I think, is this: https://amstat.tandfonline.com/doi/full/10.1080/00031305.2019.1583913

Derek-Jones commented 4 years ago

The proposal does not specify which fields the 0.005 threshold would apply to. Other fields (e.g., particle physics) already use a much lower value.
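
For comparison, the 5 sigma discovery convention used in particle physics corresponds to a one-sided p-value of roughly 3e-7:

```r
> pnorm(-5)    # one-sided p-value at the 5 sigma threshold
[1] 2.866516e-07
```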

I am trying to promote the p-value as one component of a risk model, at least in the commercial world. A p-value of 0.5 might be sustainable in some circumstances.

Derek-Jones commented 4 years ago

Fixed.