Open dselivanov opened 6 years ago
Regarding the Jensen-Shannon divergence, I think the fix proposed by Maren-Eckhoff, currently pending as an open pull request, already solves the problem. See the adapted function and test below.
There was one last comment in the above-mentioned issue 56 about still getting NaN, but it did not include an example. As far as I understand, there should be no NaNs as long as the input data is valid, which it should be at this point. (Please correct me if I am wrong.)
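The reasoning behind "no NaNs for valid input" can be written out. This is a sketch assuming the standard definition of the Jensen-Shannon divergence with the natural logarithm; the symbols P, Q and M are notation chosen here and correspond to x, y and m in the function below.

```latex
\begin{align*}
M &= \tfrac{1}{2}\,(P + Q), \\
\mathrm{JSD}(P \,\|\, Q) &= \tfrac{1}{2} \sum_i P_i \log\frac{P_i}{M_i}
                          + \tfrac{1}{2} \sum_i Q_i \log\frac{Q_i}{M_i}, \\
0 \cdot \log\frac{0}{M_i} &:= 0
  \quad \text{(convention, since } \lim_{t \to 0^+} t \log t = 0\text{)}.
\end{align*}
```

If all entries are non-negative and P_i > 0, then M_i >= P_i / 2 > 0, so every term that is not covered by the convention has a finite, positive argument inside the logarithm. The ifelse() guard in the function implements exactly this convention, so the sum stays finite.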
# adapted jensenShannon
jensenShannon <- function(x, y) {
  m <- 0.5 * (x + y)
  # fix proposed by Maren-Eckhoff to avoid log(0):
  # https://github.com/cpsievert/LDAvis/issues/56
  0.5 * (sum(ifelse(x == 0, 0, x * log(x / m))) + sum(ifelse(y == 0, 0, y * log(y / m))))
}
# create phi for testing
p <- c(0.25, 0, 0.25, 0, 0.5)
q <- c(0, 0.25, 0.25, 0, 0.5)
# all-zero row: not a valid distribution (rows should sum to one), included only for demonstration
zeros <- c(0, 0, 0, 0, 0)
phi <- rbind(p, q, qrev = rev(q), prev = rev(p), zeros)
#       [,1] [,2] [,3] [,4] [,5]
# p     0.25 0.00 0.25 0.00 0.50
# q     0.00 0.25 0.25 0.00 0.50
# qrev  0.50 0.00 0.25 0.25 0.00
# prev  0.50 0.00 0.25 0.00 0.25
# zeros 0.00 0.00 0.00 0.00 0.00
dist.mat <- proxy::dist(x = phi, method = jensenShannon)
pca.fit <- stats::cmdscale(dist.mat, k = 2)
#                [,1]       [,2]
# p      4.600278e-02 -0.1037688
# q      2.600304e-01 -0.0176260
# qrev  -2.600304e-01 -0.0176260
# prev  -4.600278e-02 -0.1037688
# zeros  2.073058e-16  0.2427896
True, but
Maybe my comment was misleading, sorry. I agree that LDAvis will have to be reimplemented; I just wanted to confirm that the fix works for this purpose. Hence, as a first step, a modified copy of createJSON might quickly solve the issues raised above with respect to creating the data for the visualization. Another matter is, of course, the potential reimplementation of the visualization itself.
It seems that the LDAvis package is not actively maintained and won't be updated on CRAN in the near future. In particular, we need an option to not reorder topics, and fixes for NaN in jensenShannon (see https://github.com/cpsievert/LDAvis/issues/56).
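To make the NaN problem concrete, here is a minimal sketch in base R. It assumes the unpatched function sums x * log(x / m) directly without guarding zero entries. The names jensenShannonUnpatched and jensenShannonPatched are chosen here for illustration and are not LDAvis identifiers.

```r
# Hypothetical unpatched variant (assumed form, for illustration only):
# zero entries are not guarded, so 0 * log(0 / m) is evaluated.
jensenShannonUnpatched <- function(x, y) {
  m <- 0.5 * (x + y)
  0.5 * sum(x * log(x / m)) + 0.5 * sum(y * log(y / m))
}

# Patched variant with the ifelse() guard discussed in this thread.
jensenShannonPatched <- function(x, y) {
  m <- 0.5 * (x + y)
  0.5 * (sum(ifelse(x == 0, 0, x * log(x / m))) +
         sum(ifelse(y == 0, 0, y * log(y / m))))
}

# Two topic-term distributions that contain zero entries.
p <- c(0.25, 0, 0.25, 0, 0.5)
q <- c(0, 0.25, 0.25, 0, 0.5)

# In R, log(0) is -Inf and 0 * -Inf is NaN, which poisons the whole sum.
unpatched <- jensenShannonUnpatched(p, q)
patched <- jensenShannonPatched(p, q)

print(unpatched)  # NaN
print(patched)    # finite, equal to 0.25 * log(2)
```

Any single NaN in the distance matrix is enough to break the downstream scaling step, which is why guarding the zero entries matters even when only a few terms have zero probability.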