pitakakariki / simr

Power Analysis of Generalised Linear Mixed Models by Simulation
68 stars 19 forks source link

Unable to plot powerCurve #187

Open weijiexu-charlie opened 3 years ago

weijiexu-charlie commented 3 years ago

Hi,

My code is as follows:

fit2 <- extend(fit, along='Worker.id', n=100) 
powerSim(fit2, test=fixed('X1:X2', 'lr'), nsim=1000)
pc2 <- powerCurve(fit2, fixed('X1:X2', 'lr'), along='Worker.id', nsim=20)

The powerSim function works well, but the powerCurve function gives me a quite strange result for print(pc2):

by largest value of Worker.id:
    100: 30.00% (11.89, 54.28) - 53 rows
     20: 20.00% ( 5.73, 43.66) - 271 rows
     30: 45.00% (23.06, 68.47) - 485 rows
      4: 45.00% (23.06, 68.47) - 680 rows
      5: 65.00% (40.78, 84.61) - 895 rows
      6: 70.00% (45.72, 88.11) - 1110 rows
      7: 75.00% (50.90, 91.34) - 1327 rows
     79: 85.00% (62.11, 96.79) - 1520 rows
     89: 90.00% (68.30, 98.77) - 1738 rows
     99: 90.00% (68.30, 98.77) - 1954 rows

The Worker.id here is not monotonic, which may be the reason that plot(pc2) cannot return a plot as expected.

Also, here's what the summary(pc2) function returns, which looks quite normal:

   nrow nlevels successes trials mean      lower     upper
1    53       3         5     20 0.25 0.08657147 0.4910459
2   271      14         5     20 0.25 0.08657147 0.4910459
3   485      25         8     20 0.40 0.19119006 0.6394574
4   680      35         6     20 0.30 0.11893159 0.5427892
5   895      46         9     20 0.45 0.23057790 0.6847219
6  1110      57        15     20 0.75 0.50895413 0.9134285
7  1327      68        15     20 0.75 0.50895413 0.9134285
8  1520      78        15     20 0.75 0.50895413 0.9134285
9  1738      89        16     20 0.80 0.56338600 0.9426660
10 1954     100        18     20 0.90 0.68301729 0.9876515

Is there any possible reason for the weird results of print(pc2) and plot(pc2)?

Thanks!

pitakakariki commented 3 years ago

It's getting confused because Worker.id is a character vector, rather than a factor or a number - note the lexicographic order in the print statement.

If you make your id a factor it should work normally.

pitakakariki commented 3 years ago

Note that there might be deeper problems here. How many times in the codebase do I assume something is either a number or a factor?

Need to fix the immediate problem for print and plot, but also add plenty of test to the unit test suite to make sure all three kinds of grouping variable work.