Open AndriSignorell opened 1 year ago
This is a conceptual discrepancy. In general, weights for observations do not necessarily correspond to frequency counts. For example, when estimating a conditional Spearman's rho, you would upweight observations in some neighborhood of the covariate value and downweight all others. The weights can be any positive real number, not just integers, so they can't in general be interpreted as repeated observations. In contrast, the freq procedure in SAS is specifically intended for frequency tables.
In your specific example, SAS interprets the frequency table as a larger data set with repeated observations (and, hence, many ties). wdm takes the frequency table as just four observations with some weight assigned; there are no ties. For Spearman's rho, we get a different result because the "mid-rank" is computed differently (ties in SAS vs. no ties in wdm). Also, wdm's independence test isn't useful here because it relies on an asymptotic approximation, which is poor with just 4 observations (see the p-values, also for Kendall's tau and Pearson correlation).
I see why the "weight = count" perspective can be useful though and might work this into the package at some point.
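To make the two readings concrete, here is a minimal Python sketch. The data, the counts, and both helper functions are hypothetical. In particular, `spearman_weighted_naive` is a deliberately simple stand-in (weighted Pearson correlation of the ranks of the distinct observations) and is not claimed to be wdm's actual algorithm. It only shows that "weight = count" with ties and "weight on untied observations" need not agree.

```python
from math import sqrt


def mid_ranks(values):
    """Average ("mid") ranks: tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def weighted_pearson(a, b, w):
    """Pearson correlation of a and b with observation weights w."""
    sw = sum(w)
    ma = sum(wi * ai for wi, ai in zip(w, a)) / sw
    mb = sum(wi * bi for wi, bi in zip(w, b)) / sw
    cov = sum(wi * (ai - ma) * (bi - mb) for wi, ai, bi in zip(w, a, b))
    va = sum(wi * (ai - ma) ** 2 for wi, ai in zip(w, a))
    vb = sum(wi * (bi - mb) ** 2 for wi, bi in zip(w, b))
    return cov / sqrt(va * vb)


def spearman_expanded(x, y, counts):
    """Frequency-table reading: repeat each row `count` times, then mid-rank."""
    x_rep = [xi for xi, c in zip(x, counts) for _ in range(c)]
    y_rep = [yi for yi, c in zip(y, counts) for _ in range(c)]
    return weighted_pearson(mid_ranks(x_rep), mid_ranks(y_rep), [1] * len(x_rep))


def spearman_weighted_naive(x, y, weights):
    """Illustrative stand-in (NOT wdm's method): rank the distinct rows
    without ties, then weight them in the correlation."""
    return weighted_pearson(mid_ranks(x), mid_ranks(y), weights)


# Hypothetical data: four distinct observations with counts/weights.
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]
counts = [5, 1, 1, 5]

rho_freq = spearman_expanded(x, y, counts)           # 12 rows, many ties
rho_naive = spearman_weighted_naive(x, y, counts)    # 4 rows, no ties
print(rho_freq, rho_naive)
```

The two values differ for these counts, while they coincide when every count is 1, since the expanded data set is then identical to the original four rows.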
Thanks for the clarification! The option would indeed be a welcome addition.
Hi Thomas, I suspect there's a bug in your Spearman code when using weights. From my understanding, the following should hold: