achetverikov closed this issue 1 year ago
Hello,
To superb, the input data frame is wide_data, which has a subj column followed by four measurement columns:
subj E_E E_F F_E F_F
1: 1 -0.44449237 -0.03235123 0.33117911 0.24256811
2: 2 -0.16501937 0.06406321 -0.15657197 0.20675888
3: 3 -0.23265304 -0.01147290 0.07758697 0.31571311
4: 4 -0.52565753 0.17656301 -0.06597765 0.57736573
5: 5 0.19321119 0.10167027 -0.14578568 -0.01145581
6: 6 -0.14903764 0.40485607 0.08435495 -0.23097020
7: 7 0.07729921 0.22796934 -0.03298668 -0.09575244
8: 8 -0.07237976 0.13240136 0.05019913 -0.01337560
9: 9 0.25639642 -0.01419813 -0.11931754 -0.24855318
10: 10 -0.26935239 -0.35479730 0.10312231 0.08981053
The columns are labelled with the first index (E) changing less frequently than the second index (E, then F). Hence, the correct call requires placing the factor B first:
superbData(wide_data, WSFactors = c('B(2)','A(2)'), # inverted A and B here
variables = colnames(wide_data[,2:5]))$summaryStatistics
Because in a within-subject design the column names carry no information about how to interpret the levels, superb prints a message (FYI) so that you can check they are interpreted as you intended:
superb::FYI: Here is how the within-subject variables are understood:
B A variable
1 1 E_E
2 1 E_F
1 2 F_E
2 2 F_F
In your example, the confusion comes from the fact that the dcast function cycles through all the levels of the second factor first. This choice is arbitrary, and other reformatting functions adopted the other ordering (e.g., lsr, Navarro), which is the convention adopted in superb. From the wide-format data structure alone, it is not possible to know how it was generated. Hence, any information you might have in data is not available to superb.
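To make the two orderings concrete, here is a small base-R sketch (it does not call superb). The vector lev and the variable names are only illustrative, and the comment about which order superb assumes is inferred from the FYI message shown above, not from superb's documentation.

```r
# Levels shared by both factors in this example
lev <- c("E", "F")

# First factor (A) cycles fastest: the order implied by WSFactors = c('A(2)','B(2)'),
# judging from the FYI message above
a_fastest <- expand.grid(A = lev, B = lev, stringsAsFactors = FALSE)
cols_a_fastest <- paste(a_fastest$A, a_fastest$B, sep = "_")

# Second factor (B) cycles fastest: the order of the columns in wide_data
b_fastest <- expand.grid(B = lev, A = lev, stringsAsFactors = FALSE)
cols_b_fastest <- paste(b_fastest$A, b_fastest$B, sep = "_")

print(cols_a_fastest)
print(cols_b_fastest)
```

Only the second vector matches colnames(wide_data[,2:5]), which is why the factors have to be listed as B, then A, unless a WSDesign is given.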
Alternatively, you can specify manually how to interpret all the columns with WSDesign:
superbData(wide_data, WSFactors = c('A(2)','B(2)'),
variables = colnames(wide_data[,2:5]),
WSDesign = list( a1b1=c(1,1), a1b2=c(1,2), a2b1=c(2,1), a2b2=c(2,2) )
)$summaryStatistics
for which the FYI message is:
superb::FYI: Here is how the within-subject variables are understood:
A B variable
1 1 E_E
1 2 E_F
2 1 F_E
2 2 F_F
The four names in the WSDesign list (e.g., a1b1) are arbitrary, but their order matches the columns of wide_data.
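If the column names encode the levels, as they do here, the WSDesign list can also be derived from them instead of typed by hand. The following is a minimal base-R sketch under two assumptions: names follow the pattern <level of A>_<level of B>, and the levels are numbered in the order given in lev. The names cols, lev and design are hypothetical helpers, not part of superb.

```r
# Column names of the four measurement columns, in the order they appear in wide_data
cols <- c("E_E", "E_F", "F_E", "F_F")

# Assumed level order for both factors: E is level 1, F is level 2
lev <- c("E", "F")

# Split each name at "_" and convert the two level labels to level numbers
parts  <- strsplit(cols, "_", fixed = TRUE)
design <- lapply(parts, function(p) match(p, lev))
names(design) <- cols   # the names are arbitrary; only the order matters

str(design)

# The list could then be passed on, e.g.:
# superbData(wide_data, WSFactors = c('A(2)','B(2)'),
#     variables = cols, WSDesign = design)$summaryStatistics
```

Each element of design holds the pair (level of A, level of B) for the corresponding column, in the same form as the hand-written list above.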
Regarding "A related problem is that the factor labels are not preserved": the same is true of wide_data and of colMeans(wide_data[,2:5]).
Hope it helps!
OK, thanks! This is indeed helpful.
Factor levels in the outputs are not sorted properly.
Gives:
So A == 1 & B == 2 now corresponds to F_E instead of E_F.
A related problem is that the factor labels are not preserved.