Dealing with weak networks

hturner / PlackettLuce

PlackettLuce package for Plackett-Luce models in R


library("PlackettLuce")
source("https://raw.githubusercontent.com/AgrDataSci/ClimMob-analysis/master/R/functions.R")

R <- matrix(c(1, 2, 0, 0, 3,
              4, 1, 0, 0, 2,
              2, 1, 0, 0, 3,
              1, 2, 0, 4, 3,
              2, 1, 0, 3, 4,
              4, 1, 0, 0, 2,
              2, 1, 0, 0, 3,
              1, 2, 0, 1, 3,
              2, 0, 0, 0, 1,
              0, 0, 0, 1, 2), nrow = 10, byrow = TRUE)
colnames(R) <- c("apple", "banana", "orange", "pear", "grape")
R <- as.rankings(R)

# drop rows 9 and 10, supposing that they belong to a different fold in a
# cross-validation
R <- R[-c(9:10), ]

G <- group(R, index = 1:length(R))
p <- data.frame(p = rep(1, length(G)))
dt <- cbind(G, p)

pl <- pltree(G ~ p, data = dt)

# it does not work as shown in issue #25
predict(pl, newdata = dt)
AIC(pl, newdata = dt)

# but works with vcov = FALSE for predict()
predict(pl, newdata = dt, vcov = FALSE)

# and still doesn't work for AIC
AIC(pl, newdata = dt, vcov = FALSE)

# this is because orange dropped out of the network when we sampled the folds
a <- adjacency(R)
plot(network(a))

# the issue still persists even if we increase npseudo
pl2 <- pltree(G ~ p, data = dt, npseudo = 0.8)

Thanks for digging down to find the cause of this issue.

The addition of pseudo-rankings allows the worth to be estimated, but these pseudo-rankings are removed before the variance-covariance matrix is estimated. If an item is then completely missing from the rankings, the information matrix has a zero row and column for that item, which makes it non-invertible, so the variance can't be estimated. I am not sure what the appropriate fix should be here but will follow this up (it may be a few months before I get to it, as I am prioritising work on PLADMM in May/June).
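As a rough illustration of the condition described above, the following base R sketch counts how often each item is ranked in the retained fold and lists any item that never appears. It works on a plain integer matrix rather than a rankings object, and the names `R_mat`, `fold`, `n_ranked` and `absent` are made up for this example; it is only a pre-fit check under those assumptions, not something the package does for you.

```r
# Rankings from the example above as a plain matrix (0 = item not ranked).
# R_mat, fold, n_ranked and absent are illustrative names only.
R_mat <- matrix(c(1, 2, 0, 0, 3,
                  4, 1, 0, 0, 2,
                  2, 1, 0, 0, 3,
                  1, 2, 0, 4, 3,
                  2, 1, 0, 3, 4,
                  4, 1, 0, 0, 2,
                  2, 1, 0, 0, 3,
                  1, 2, 0, 1, 3,
                  2, 0, 0, 0, 1,
                  0, 0, 0, 1, 2), nrow = 10, byrow = TRUE)
colnames(R_mat) <- c("apple", "banana", "orange", "pear", "grape")

# Drop rows 9 and 10, mimicking the cross-validation fold
fold <- R_mat[-(9:10), ]

# Number of rankings in which each item appears
n_ranked <- colSums(fold > 0)

# Items that never appear in this fold; their rows and columns in the
# information matrix would be zero
absent <- names(n_ranked)[n_ranked == 0]
absent
```

If `absent` is non-empty for a fold, standard errors for that fit cannot be expected to be available, so it may be simpler to request estimates without the variance-covariance matrix or to resample the fold.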

AIC.pltree() doesn't need to compute the variance-covariance matrix; the error was thrown by a call to itempar(), which defaults to vcov = TRUE. I have replaced this call and made a PR to the master branch; once that's merged in, AIC(pl, newdata = dt) should work if you install the package from GitHub. However, as newdata is actually the original data used in the fit here, it would be better to simply call AIC(pl), which avoids even more unnecessary computation and should work with the current PlackettLuce release (0.4.0). (This also goes for the call to predict: better not to specify newdata unless you are specifying data that is different from the data used in the fit!)


Dealing with weak networks #50