Closed lmkirvan closed 6 years ago
Hi - I'm glad you're finding the package useful!
Regarding the NaN - is it possible you have a NA value in phi? Also - a long shot here - did you mean to write params$phi rather than parems$phi, i.e. just a typo?
If you want to share the data to make the error reproducible, that might also help us troubleshoot.
-k
I can save the values of phi if that would be helpful. Let me know and I can send it to you via email, or upload it to a GitHub repo. The phi I'm using does not include any NA values and all rows sum to 1. There are several zero values (because of rounding), but I understood that wouldn't be a problem. I think that it's a problem with the distance function as written.
jsPCA2 <- function(phi) {
  jensenShannon2 <- function(x, y) {
    m <- 0.5 * (x + y)
    0.5 * sum(x * log(x/m)) + 0.5 * sum(y * log(y/m))
  }
  dist.mat <- proxy::dist(x = phi, method = 'Jaccard')
  return(dist.mat)
  pca.fit <- stats::cmdscale(dist.mat, k = 2)
  data.frame(x = pca.fit[, 1], y = pca.fit[, 2])
}
As you can see, I edited the jsPCA function, and using another distance metric (chosen at random) does not return NaN.
I've also spotted a question on SO that looks like someone is experiencing a similar problem.
http://stackoverflow.com/questions/35830008/r-ldavis-k-2-createjson-error
Let me know if you'd like the phi file.
Thanks for your help.
-L
NaN is returned when you have 0 values in the phi matrix. That's why you have to add a constant to every value in the phi matrix, as is done in the tutorial.
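A minimal sketch of that smoothing idea, for illustration only. The helper name `smooth_phi` and the constant `eps = 1e-8` are assumptions and are not taken from the tutorial; the rows are renormalised so that each still sums to 1.

```r
# Hypothetical helper (not part of LDAvis): add a small constant to every
# entry of phi, then renormalise each row so it still sums to 1.
# The default eps is an arbitrary illustrative value.
smooth_phi <- function(phi, eps = 1e-8) {
  phi <- phi + eps
  # Dividing a matrix by a vector of length nrow(phi) divides row i by rs[i].
  phi / rowSums(phi)
}

# Two made-up topic rows containing exact zeros.
phi <- rbind(c(0.5, 0.5, 0),
             c(0.5, 0, 0.5))
phi_s <- smooth_phi(phi)

all(phi_s > 0)    # no zeros remain
rowSums(phi_s)    # rows still sum to 1 (up to floating point)
```

Note that this changes the values of phi slightly, which is the trade-off against handling the zero case inside the distance function.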
Hi Marcin, thanks for the great package. I think the solution should not be to add a constant. The problem appears because R sets 0*log(0) as NaN. But mathematically, the limit of x log(x) for x to 0 is 0. Therefore, the summand in the jensenShannon metric should be 0. For example, you could replace
sum(x * log(x/m))
by
sum(ifelse(x == 0, 0, x * log(x/m)))
Best, Maren
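For reference, a sketch of the whole distance function with that replacement applied to both summands. The name `js_safe` is hypothetical and this is not the code shipped in LDAvis; it only illustrates the suggestion above.

```r
# Jensen-Shannon divergence that treats 0 * log(0) as its limit, 0.
# `js_safe` is a hypothetical name, not a function exported by LDAvis.
js_safe <- function(x, y) {
  m <- 0.5 * (x + y)
  # ifelse() picks 0 wherever the probability is 0, so the NaN produced
  # by 0 * log(0) in the other branch never reaches sum().
  kl_x <- sum(ifelse(x == 0, 0, x * log(x / m)))
  kl_y <- sum(ifelse(y == 0, 0, y * log(y / m)))
  0.5 * kl_x + 0.5 * kl_y
}

# Two made-up topic rows; each sums to 1 and contains a zero.
x <- c(0.5, 0.5, 0)
y <- c(0.5, 0, 0.5)

js_safe(x, y)   # finite, equal to 0.5 * log(2)
js_safe(x, x)   # 0
```

It could then be passed to the call from the original post in place of the old function, e.g. `proxy::dist(x = phi, method = js_safe)`.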
@Maren-Eckhoff that's a great solution.
I'm not the owner of the package, but @cpsievert is and might like to know about this improvement.
I have encountered the same issue. I applied the fix mentioned above by @Maren-Eckhoff (thanks!). It works in most cases but still fails in some, where the method jsPCA returns the error "infinite or missing values in 'x'".
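The thread does not identify the cause of these remaining failures. One possibility worth ruling out is non-finite or malformed entries in phi itself. The sketch below is a hypothetical diagnostic helper (`check_phi` and its tolerance are assumptions, not part of LDAvis) that counts such entries before the matrix is passed to a distance function.

```r
# Hypothetical helper: report problems in a topic-term matrix.
check_phi <- function(phi, tol = 1e-8) {
  rs <- rowSums(phi)
  list(
    # Entries that are NA, NaN, Inf or -Inf.
    non_finite = sum(!is.finite(phi)),
    # Negative entries (non-finite ones are skipped here).
    negative = sum(phi < 0, na.rm = TRUE),
    # Rows whose sum is missing or differs from 1 by more than tol.
    bad_rows = sum(is.na(rs) | abs(rs - 1) > tol)
  )
}

# Made-up example: the second row is what 0 / 0 normalisation of an
# all-zero row would produce.
phi <- rbind(c(0.5, 0.5, 0),
             c(NaN, NaN, NaN),
             c(0.2, 0.3, 0.5))
res <- check_phi(phi)
res$non_finite   # 3
res$negative     # 0
res$bad_rows     # 1
```

If all three counts are zero, the problem presumably lies elsewhere, for example in the distance matrix that is handed to the scaling step.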
I really enjoy this package and appreciate your work on it. I've previously used it successfully, but updated the package recently and now get an error that I previously had not encountered.
I can't quite figure out why (as the Jensen-Shannon distance function looks okay), but
`jensenShannon <- function(x, y) {
  m <- 0.5 * (x + y)
  0.5 * sum(x * log(x/m)) + 0.5 * sum(y * log(y/m))
}
dist.mat <- proxy::dist(x = parems$phi, method = jensenShannon)`
returns NaN using phi.
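A small reproduction of the behaviour, using the distance function as posted and two made-up rows (the vectors `x` and `y` are illustrative, not the reporter's data). It shows that a single exact zero in a row is enough to produce NaN.

```r
# The distance function as posted in this issue.
jensenShannon <- function(x, y) {
  m <- 0.5 * (x + y)
  0.5 * sum(x * log(x / m)) + 0.5 * sum(y * log(y / m))
}

0 * log(0)            # NaN: log(0) is -Inf and 0 * -Inf is NaN

# Two made-up topic rows; each sums to 1 and contains a zero.
x <- c(0.5, 0.5, 0)
y <- c(0.5, 0, 0.5)
jensenShannon(x, y)   # NaN, because sum() propagates the NaN summand
```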