hneth / riskyr

A toolbox for rendering risk literacy more transparent

Distinguish between 2 types of scaling #23

Closed hneth closed 5 years ago

hneth commented 6 years ago

When plotting frequencies as graphical objects (lines, boxes, or squares), their dimensions can be scaled by magnitude (e.g., plot_fnet with area = "sq", or the new plot_bar function). When frequencies are rounded to integers (the default), the scaled graph may deviate from the underlying probabilities (especially for small population sizes N). In the extreme, small frequencies may be rounded to 0 and disappear from plots.
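To make the rounding effect concrete, here is a minimal standalone sketch in Python (not riskyr code). The function name `frequencies`, its parameters, and the level-by-level rounding order are assumptions made for illustration; the package may compute and round frequencies differently.

```python
def frequencies(prev, sens, spec, N, round_freq=True):
    """Split a population of N into hits, misses, false alarms, correct rejections.

    Hypothetical helper: assumes frequencies are rounded level by level
    (condition first, then decisions), which may differ from riskyr.
    """
    n_true = N * prev                 # cases with the condition
    if round_freq:
        n_true = round(n_true)
    n_false = N - n_true              # cases without the condition

    hi = sens * n_true                # hits (true positives)
    cr = spec * n_false               # correct rejections (true negatives)
    if round_freq:
        hi = round(hi)
        cr = round(cr)

    mi = n_true - hi                  # misses (false negatives)
    fa = n_false - cr                 # false alarms (false positives)
    return {"hi": hi, "mi": mi, "fa": fa, "cr": cr}


# Small population with a rare condition: N = 10, prevalence = 4%.
exact = frequencies(prev=0.04, sens=0.9, spec=0.8, N=10, round_freq=False)
rounded = frequencies(prev=0.04, sens=0.9, spec=0.8, N=10, round_freq=True)

print(exact)
print(rounded)
```

In this example the exact number of hits is about 0.36, but with rounding both hits and misses become 0, so the corresponding boxes would have no area in a plot scaled by rounded frequencies.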

To control this effect, introduce a scale option that defines whether objects are scaled by (rounded or non-rounded) frequencies or by (exact) probabilities. (See plot_bar for a first implementation and generalize to other plots.)

hneth commented 5 years ago

The latest plotting functions (e.g., plot_area, plot_bar, plot_prism, and plot_tab) now use a scale argument to distinguish between 2 options:

  1. scale = "p": scaling by (exact) probabilities, or

  2. scale = "f": scaling by (rounded or non-rounded) frequencies.

When scaling by frequencies, both the dimensions of areas in the plot and the probabilities shown in the plot are computed from current frequencies, even when exact probabilities are provided.
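The difference between the 2 options can be sketched as follows. This is a standalone Python illustration, not the package's implementation: `rounded_freqs` and `plotted_probs` are hypothetical names, and the rounding order is an assumption.

```python
def rounded_freqs(prev, sens, spec, N):
    """Hypothetical helper: integer frequencies, rounded level by level."""
    n_true = round(N * prev)          # cases with the condition
    n_false = N - n_true              # cases without the condition
    hi = round(sens * n_true)         # hits
    cr = round(spec * n_false)        # correct rejections
    return n_true, n_false, hi, cr


def plotted_probs(prev, sens, spec, N, scale="f"):
    """Return the probabilities a plot would use under each scaling option."""
    if scale == "p":
        # Scale by the exact probabilities as provided.
        return {"prev": prev, "sens": sens, "spec": spec}
    if scale != "f":
        raise ValueError('scale must be "p" or "f"')
    # Scale by frequencies: recompute probabilities from the (rounded) counts.
    n_true, n_false, hi, cr = rounded_freqs(prev, sens, spec, N)
    nan = float("nan")
    return {
        "prev": n_true / N,
        "sens": hi / n_true if n_true > 0 else nan,
        "spec": cr / n_false if n_false > 0 else nan,
    }


small = plotted_probs(0.3, 0.85, 0.75, N=10, scale="f")
large = plotted_probs(0.3, 0.85, 0.75, N=10000, scale="f")
exact = plotted_probs(0.3, 0.85, 0.75, N=10, scale="p")

print(small)
print(large)
print(exact)
```

With N = 10, the 3 condition-true cases yield 3 rounded hits, so the sensitivity recomputed from frequencies is 1.0 rather than the 0.85 that was provided. With N = 10000 the recomputed values match the provided ones.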

For most plots, the consequences of the 2 scaling options are negligible or small, but when using rounded frequencies (i.e., round = TRUE) with small population sizes (low values of N), the differences can be substantial.