hneth / riskyr

A toolbox for rendering risk literacy more transparent

Pass frequencies (instead of probabilities) to objects #17

Open hneth opened 6 years ago

hneth commented 6 years ago

To define a riskyr object, we currently pass 3 essential probabilities (prev, sens, spec). However, we also have functions that translate probabilities into frequencies (and vice versa). So why not allow users to pass either probabilities or frequencies to define a riskyr object?
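The translation mentioned above can be sketched as follows. This is a minimal illustration in Python, not riskyr's R implementation: the function names `prob_to_freq` and `freq_to_prob` are hypothetical, and the cell labels `hi`, `mi`, `fa`, `cr` (hits, misses, false alarms, correct rejections) are assumed here as names for the 4 frequencies. Frequencies are left unrounded.

```python
def prob_to_freq(prev, sens, spec, N):
    """Expected (unrounded) frequencies of the 4 cells of a 2x2 diagnostic table.

    Hypothetical helper for illustration; not part of riskyr.
    """
    cond_true = prev * N           # cases with the condition
    cond_false = N - cond_true     # cases without the condition
    hi = sens * cond_true          # hits (true positives)
    mi = cond_true - hi            # misses (false negatives)
    cr = spec * cond_false         # correct rejections (true negatives)
    fa = cond_false - cr           # false alarms (false positives)
    return {"hi": hi, "mi": mi, "fa": fa, "cr": cr}


def freq_to_prob(hi, mi, fa, cr):
    """Recover the 3 essential probabilities (and N) from the 4 frequencies."""
    N = hi + mi + fa + cr
    return {
        "prev": (hi + mi) / N,
        "sens": hi / (hi + mi),
        "spec": cr / (fa + cr),
        "N": N,
    }


# Round trip: either representation could define the same object.
freqs = prob_to_freq(prev=0.1, sens=0.8, spec=0.9, N=1000)
probs = freq_to_prob(**freqs)
```

Since both directions carry the same information (given N), a constructor could accept either set of arguments and derive the other.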

hneth commented 6 years ago

Partially realized (thanks Nico!), but still awaiting thorough testing and refinement.

hneth commented 6 years ago

Let's generalize this issue to accommodate additional types of scenarios (beyond the primary case of test results in medical diagnostics):

Defining a scenario

To represent any m x n contingency table, a user could define

  1. the name and levels of an x-dimension (columns) and
  2. the name and levels of a y-dimension (rows).

For instance, a possible x-dimension could be "treatment" (with levels "medication", "placebo") and a corresponding y-dimension could be "outcome" (with levels "cured", "not cured").

To initialize a scenario, a user would have to provide either a vector or matrix of all m x n frequencies, or a sufficient set of essential probabilities (together with a population size N that is either provided or computed).
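As a rough sketch of what such a scenario definition might look like (in Python for illustration only; the class `Scenario` and its field names are hypothetical, and the frequencies are made-up numbers):

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """An m x n contingency table with named dimensions (hypothetical structure)."""
    x_name: str
    x_levels: list   # column labels
    y_name: str
    y_levels: list   # row labels
    freq: list       # list of rows; freq[i][j] = count for y_levels[i], x_levels[j]

    def __post_init__(self):
        # The frequency matrix must match the declared levels.
        if len(self.freq) != len(self.y_levels):
            raise ValueError("number of rows must equal number of y levels")
        if any(len(row) != len(self.x_levels) for row in self.freq):
            raise ValueError("each row must have one entry per x level")

    @property
    def N(self):
        """Population size, computed as the sum of all cell frequencies."""
        return sum(sum(row) for row in self.freq)


# The treatment example from above, with invented frequencies:
s = Scenario(
    x_name="treatment", x_levels=["medication", "placebo"],
    y_name="outcome", y_levels=["cured", "not cured"],
    freq=[[30, 10], [20, 40]],
)
```

When frequencies are given, N follows from the table; when only probabilities are given, N would have to be supplied or chosen separately.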

Perspectives on a scenario

Distinguish between 3 different perspectives (binary splits):

  1. by columns (sums across the x dimension);
  2. by rows (sums across the y dimension);
  3. by diagonals (sums of 00+11 vs. 01+10 cells, in the 2x2 case).

In the diagnostic case, these perspectives are mapped onto the semantics of

  1. by condition;
  2. by decision;
  3. by correspondence of decision to condition (i.e., accuracy).
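For the 2x2 case, the three splits can be illustrated with a short sketch. This is Python for illustration only; the function name `perspectives` is hypothetical, and the layout (rows = decisions, columns = conditions) is an assumption made for the example.

```python
def perspectives(table):
    """Return the three binary splits of a 2x2 table given as [[a, b], [c, d]].

    Rows are the y-dimension, columns the x-dimension (assumed layout).
    """
    (a, b), (c, d) = table
    return {
        "by_columns": (a + c, b + d),    # column sums
        "by_rows": (a + b, c + d),       # row sums
        "by_diagonals": (a + d, b + c),  # main diagonal vs. off-diagonal
    }


# Diagnostic case: rows = decision (positive, negative),
# columns = condition (true, false), i.e. [[hi, fa], [mi, cr]].
splits = perspectives([[80, 90], [20, 810]])
# by_columns   -> by condition:  (condition true, condition false)
# by_rows      -> by decision:   (decision positive, decision negative)
# by_diagonals -> by accuracy:   (correct, incorrect)
```

Each split partitions the same N cases into two groups, so the two numbers in every pair sum to N.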

Contingency table as output

Given this basic input structure for scenarios (of various types), we could print a (minimal or full) contingency table as an output (e.g., via a function print_ct). The summary function could then call this function by default as part of its output.
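One possible reading of "minimal vs. full" is a table without vs. with marginal sums. The sketch below assumes that interpretation; it is written in Python for illustration, the function name `format_ct` and its `full` flag are hypothetical, and the frequencies are invented.

```python
def format_ct(freq, x_levels, y_levels, full=True):
    """Format an m x n frequency table as text.

    With full=True, row and column sums are appended (assumed meaning of
    a "full" table); with full=False, only the cells are shown.
    """
    header = [""] + list(x_levels) + (["Sum"] if full else [])
    rows = [header]
    for label, row in zip(y_levels, freq):
        cells = [str(v) for v in row]
        if full:
            cells.append(str(sum(row)))          # row sum
        rows.append([label] + cells)
    if full:
        col_sums = [sum(col) for col in zip(*freq)]
        total = sum(col_sums)                    # overall N
        rows.append(["Sum"] + [str(v) for v in col_sums] + [str(total)])
    # Right-align every cell to the widest entry.
    width = max(len(cell) for r in rows for cell in r)
    return "\n".join("  ".join(cell.rjust(width) for cell in r) for r in rows)


print(format_ct([[30, 10], [20, 40]],
                x_levels=["medication", "placebo"],
                y_levels=["cured", "not cured"]))
```

A summary method could call such a formatter with `full=True` by default and offer the minimal variant as an option.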

Overall, these changes would merely affect some variable labels and the semantics (e.g., names and colors) of some frequencies and probabilities. While preserving everything implemented so far, the changes would allow for representing a much wider range of possible scenarios (including cases of evaluating treatments, prevention, and causality).

hneth commented 3 years ago

Rethink & generalize

The ideas mentioned in this issue are still good and valid, but should be generalized further:

Essentially, we should implement the steps of the matrix lens model (i.e., filtering, framing, focusing, and presenting, see Neth et al., 2021, doi 10.3389/fpsyg.2020.567817), starting either (a) from raw data or (b) from a description (and corresponding simulation).

hneth commented 3 years ago

(Reopened this issue, as it hasn't been resolved yet.)