jeffreypullin / rater

R package to fit statistical models to repeated categorical rating data using Stan
https://jeffreypullin.github.io/rater
GNU General Public License v2.0

Passing "true" class labels as prior for selected z parameters #184

Open haukelicht opened 11 months ago

haukelicht commented 11 months ago

Thank you for providing this awesome package!

In the "Workflow" vignette (line 103), you note that '{rater} also supports ... setting (some of) the prior parameters ...'.

How can I implement priors for the "true" category of some of the items? I have "ground truth" labels for a subset of the items in my use case and I want to "pull" the model's $z$ estimates towards these values during inference.

Take, for example, the ratings of item 12 in the anesthesia dataset:

> library(rater)
> data("anesthesia")
> anesthesia[anesthesia$item == 12, ]
   item rater rating
78   12     1      2
79   12     1      2
80   12     1      2
81   12     2      3
82   12     3      3
83   12     4      4
84   12     5      3

What if I know that the "ground truth" rating for this item is 3?

Thank you for your help!

jeffreypullin commented 11 months ago

Hi Hauke,

Thanks for your interest in rater!

I would recommend adding a new 'ground truth' rater to your data for the items you have ground truth for, and then specifying a prior for that rater that encodes that it is very accurate (i.e. has large on-diagonal entries in its prior parameter matrix).

In code something like:

library(rater)

# We have 'ground truth' ratings for the first three patients
# (the value 1 is purely illustrative). Add them as ratings from a new
# rater 6, keeping the original ratings from raters 1 to 5.
ground_truth <- data.frame(item = 1:3, rater = 6, rating = 1)
anesthesia_w_ground_truth <- rbind(anesthesia, ground_truth)

# Taken from the rater() function's code.
# J: number of raters (5 original + 1 'ground truth'), K: number of categories.
J <- 6
K <- 4

N <- 8
p <- 0.6
on_diag <- N * p
off_diag <- N * (1 - p) / (K - 1)

beta_slice <- matrix(off_diag, nrow = K, ncol = K)
diag(beta_slice) <- on_diag

# Raters 1 to 5 all get the same prior slice.
beta <- array(dim = c(J, K, K))
for (j in 1:5) {
  beta[j, , ] <- beta_slice
}

beta_slice_ground_truth <- matrix(off_diag, nrow = K, ncol = K)
# This value may require tweaking.
diag(beta_slice_ground_truth) <- 15
beta[6, , ] <- beta_slice_ground_truth

fit_w_ground_truth <- rater(anesthesia_w_ground_truth,
                            dawid_skene(beta = beta))
fit <- rater(anesthesia, "dawid_skene")

plot(fit, "theta")
plot(fit_w_ground_truth, "theta")
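
To get a feel for how strong the value 15 is, it may help to look at the prior mean accuracy it implies. This is only a sketch: it assumes each row of a rater's K x K slice of beta is the parameter vector of a Dirichlet prior on the corresponding row of theta, so the prior mean of the diagonal entry is that entry divided by the row sum. The helper `prior_mean_accuracy()` is a hypothetical name, not part of rater.

```r
# Same constants as in the code above.
K <- 4
N <- 8
p <- 0.6
off_diag <- N * (1 - p) / (K - 1)

# Hypothetical helper (not part of rater): prior mean of the diagonal
# entry of a Dirichlet row with one on-diagonal parameter and K - 1
# identical off-diagonal parameters.
prior_mean_accuracy <- function(on_diag, off_diag, K) {
  on_diag / (on_diag + (K - 1) * off_diag)
}

# Prior used for raters 1 to 5.
default_accuracy <- prior_mean_accuracy(N * p, off_diag, K)

# Prior used for the 'ground truth' rater (on-diagonal value 15).
ground_truth_accuracy <- prior_mean_accuracy(15, off_diag, K)

print(c(default = default_accuracy, ground_truth = ground_truth_accuracy))
```

Under that assumption the prior for raters 1 to 5 has mean accuracy 0.6, while the value 15 gives roughly 0.82, so a larger on-diagonal value may be needed if the 'ground truth' rater should be treated as close to infallible.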

One disclaimer, however: I have not used this technique in a real data analysis.
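
For item 12 from the question, the same idea would amount to appending one extra row from the 'ground truth' rater. The sketch below retypes the item 12 rows from the question so that it runs without rater installed. The rater index 6 is an assumption: it is simply the next unused index after the five raters in the anesthesia data.

```r
# Ratings of item 12, retyped from the output shown in the question.
item_12 <- data.frame(
  item   = 12,
  rater  = c(1, 1, 1, 2, 3, 4, 5),
  rating = c(2, 2, 2, 3, 3, 4, 3)
)

# One extra row: the assumed 'ground truth' rater 6 rates item 12 as 3.
ground_truth <- data.frame(item = 12, rater = 6, rating = 3)

# Append the row, keeping the original ratings untouched.
item_12_w_ground_truth <- rbind(item_12, ground_truth)
print(item_12_w_ground_truth)
```

With the full dataset, the extra row would be appended to anesthesia in the same way before fitting.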

Hope that helps, let me know how you get on!

Cheers, Jeffrey