joaquinanguera / aceR

An R package for processing ACE data
MIT License

Calculation of d' #32

Closed mattminder closed 3 years ago

mattminder commented 3 years ago

Hello,

I think I've found some bugs in the calculation of the d' value in the file aceR/R/math-detection.R.

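In the paper "Calculation of signal detection theory measures" (Stanislaw & Todorov, 1999), the hit rate H and the false alarm rate F are defined as:

    H = (number of hits) / (number of target trials)
    F = (number of false alarms) / (number of nontarget trials)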

Meanwhile, lines 36 and 37 of aceR/R/math-detection.R define the false alarm rate and hit rate differently:

    out$false_alarm_rate = freq$false_alarm / num_nontargets
    out$hit_rate = freq$hit / num_nontargets

Dividing the hit count by the number of nontargets instead of the number of targets yields a different d' value.

Furthermore, the function ace_dprime uses yet another definition of the two rates: there, the hit rate is the number of hits divided by the total number of trials, and the false alarm rate is the number of false alarms divided by the total number of trials. This gives yet another d'.
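To make the difference concrete, here is a minimal sketch comparing the three denominators on one set of made-up counts. The counts, variable names, and the dprime() helper are illustrative assumptions, not code from aceR; only qnorm() is a real base R function.

```r
# Hypothetical counts for a single participant (not taken from any ACE data)
n_targets      <- 40
n_nontargets   <- 60
n_hits         <- 32
n_false_alarms <- 6
n_total        <- n_targets + n_nontargets

# Illustrative helper: d' = z(H) - z(F), with z() the inverse normal CDF
dprime <- function(hit_rate, fa_rate) qnorm(hit_rate) - qnorm(fa_rate)

# Textbook definition: each rate is divided by its own trial type
d_standard <- dprime(n_hits / n_targets, n_false_alarms / n_nontargets)

# Variant described for lines 36-37: hits divided by the number of nontargets
d_nontarget_denominator <- dprime(n_hits / n_nontargets,
                                  n_false_alarms / n_nontargets)

# Variant described for ace_dprime: both counts divided by all trials
d_total_denominator <- dprime(n_hits / n_total, n_false_alarms / n_total)

print(c(standard = d_standard,
        nontarget_denominator = d_nontarget_denominator,
        total_denominator = d_total_denominator))
```

With these particular counts the three values all differ, which is the discrepancy I am asking about. Note that a rate of exactly 0 or 1 would make qnorm() return an infinite value, so real data may need a correction that this sketch leaves out.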

Is this a bug or is there something I'm not seeing?

Thanks :)

monicathieu commented 3 years ago

Hi Matt,

Thanks for looking through this! It looks like there are indeed some bugs(!), as well as some places I can document better and eventually delete very old, deprecated code.

  1. ace_detection() is long deprecated (some of these are remnants from a verrrry old, like circa 2015, version of this package). I haven't deleted this function and its calling function, ace_detection_rate(), purely out of fear that something will break, but this code truly has no use anymore so it's best deleted to clean up the package.
  2. ace_dprime() and ace_dprime_wide() in that same file are used--the former for SAAT and TNT data, and the latter for Filter. I think you're right that ace_dprime() is currently lumping target and non-target trials together in the denominator of hit and FA rates, by counting the entire length of the column instead of only the length of the target and non-target portions. And because qnorm() is a nonlinear transformation, the resulting d' will be off, and in general be lower than it should be. (I think rank-order relationships between participants should be preserved, but the d' is off!!)
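As a quick sanity check of the two expectations in point 2 (lower d' values, same ordering of participants), something like the following could be run. This is only a sketch on invented counts for four hypothetical participants who all saw the same number of target and non-target trials; none of the names below come from aceR, and a handful of made-up rows is not a proof of the general case.

```r
# Hypothetical design: every participant sees 40 targets and 60 non-targets
n_targets    <- 40
n_nontargets <- 60
n_total      <- n_targets + n_nontargets

# Invented per-participant counts
hits         <- c(20, 28, 32, 36)
false_alarms <- c(12, 9, 6, 3)

# d' with separate denominators for target and non-target trials
d_separate <- qnorm(hits / n_targets) - qnorm(false_alarms / n_nontargets)

# d' with all trials lumped into one denominator
d_pooled <- qnorm(hits / n_total) - qnorm(false_alarms / n_total)

# Are the pooled values lower for every participant in this example?
all_lower <- all(d_pooled < d_separate)

# Do both versions order the participants the same way in this example?
same_order <- all(rank(d_pooled) == rank(d_separate))

print(data.frame(d_separate, d_pooled))
print(c(all_lower = all_lower, same_order = same_order))
```

Running the same comparison on real SAAT or TNT data, where trial counts can differ between participants, would be a better test of whether the ordering really is preserved.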

This is pretty serious but also easy to fix so I will patch this ASAP. Thank you for catching this! The more eyeballs the better.

mattminder commented 3 years ago

Perfect, thanks for the quick response!