audeering / audpsychometric

Analyse rater annotations

Rename rater_agreement_pearson()? #15

Closed · hagenw closed 3 weeks ago

hagenw commented 3 weeks ago

In https://github.com/audeering/audpsychometric/pull/13#pullrequestreview-2260333453 there was a brief discussion of what audpsychometric.rater_agreement_pearson() actually measures, e.g. whether it is more related to agreement with a gold standard, or whether it measures rater reliability.

At the moment it does not fit either of those categories, as it does not return an agreement value per stimulus, nor a single reliability score, but an agreement value per rater.
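To illustrate, "an agreement value per rater" can be sketched as the correlation of each rater's ratings with the mean over all raters. This is a minimal sketch of the idea, not necessarily the actual audpsychometric implementation (the rows-are-stimuli layout is an assumption):

```python
import numpy as np

# Hypothetical ratings matrix: rows = stimuli, columns = raters
# (the layout is an assumption for illustration only).
ratings = np.array([
    [1.0, 1.2, 0.9],
    [2.0, 2.1, 1.8],
    [3.0, 2.9, 3.2],
    [4.0, 4.2, 3.9],
])
mean_rating = ratings.mean(axis=1)  # average rating per stimulus
# One Pearson correlation per rater, each against the mean rating
agreement = [
    float(np.corrcoef(ratings[:, i], mean_rating)[0, 1])
    for i in range(ratings.shape[1])
]
```

Here `agreement` holds three values, one per rater, rather than one value per stimulus or a single reliability score.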

@ChristianGeng do you think we should still rename it, or would it be fine to stay with the current name?

ChristianGeng commented 3 weeks ago

> In #13 (review) there was a brief discussion of what audpsychometric.rater_agreement_pearson() actually measures, e.g. whether it is more related to agreement with a gold standard, or whether it measures rater reliability.
>
> At the moment it does not fit either of those categories, as it does not return an agreement value per stimulus, nor a single reliability score, but an agreement value per rater.
>
> @ChristianGeng do you think we should still rename it, or would it be fine to stay with the current name?

I had somehow forgotten about it - it is only used by the EWE (evaluator weighted estimator), and putting it into the public API may have happened somewhat hastily. As the module structure is flattened now, I would no longer argue for renaming it for these reasons.

However, there is a different aspect that might make me want to rename it: I think one might want to base rater agreement, and therefore also the EWE, on a different association measure than Pearson's at some stage. For example, for ill-behaved data, as in health applications, might one not consider Kendall's tau, Spearman's rho, or something even more exotic within the same API?

Would it not be more general to implement it like this:

def rater_agreement(
    ratings: typing.Sequence,
    *,
    axis: int = 1,
    association_type: str = "pearson",
) -> np.ndarray:
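For concreteness, the proposed signature could dispatch to different association measures, e.g. like this. This is a hypothetical sketch, not audpsychometric's implementation: the dispatch table, the helper functions, and the assumption that `axis=1` means raters lie along the columns are all mine.

```python
import typing

import numpy as np


def _pearson(x, y):
    return float(np.corrcoef(x, y)[0, 1])


def _spearman(x, y):
    # Spearman's rho: Pearson correlation of the ranks
    def rank(v):
        return np.argsort(np.argsort(v))
    return _pearson(rank(x), rank(y))


def _kendall(x, y):
    # Naive O(n^2) Kendall's tau-a (no tie correction)
    n = len(x)
    s = sum(
        np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return float(s) / (n * (n - 1) / 2)


_MEASURES = {
    "pearson": _pearson,
    "spearman": _spearman,
    "kendall": _kendall,
}


def rater_agreement(
    ratings: typing.Sequence,
    *,
    axis: int = 1,
    association_type: str = "pearson",
) -> np.ndarray:
    """Return one agreement value per rater: the association of each
    rater's ratings with the mean rating over all raters."""
    measure = _MEASURES[association_type]
    ratings = np.atleast_2d(np.asarray(ratings, dtype=float))
    if axis == 1:  # raters along columns (assumed layout)
        ratings = ratings.T
    mean_rating = ratings.mean(axis=0)  # mean over raters, per stimulus
    return np.array([measure(r, mean_rating) for r in ratings])
```

Swapping the measure then only requires changing `association_type`, without touching the EWE code that calls it.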

Are Kendall's tau and Spearman's rho part of audmetric at all?