tapj / biotyper

an R package to biotype a community

Is JSD distance calculated using "no.unassigned"? #2

Closed xmarti6 closed 5 years ago

xmarti6 commented 6 years ago

Hi, I'm trying to understand enterotypes from the tutorial (http://enterotyping.embl.de/enterotypes.html#genustable) and from the code (https://github.com/tapj/biotyper/blob/master/R/BiotypeR.r), but I can't tell whether, in the end, you use the "unassigned" feature when calculating the JSD distance. According to the tutorial, "we do not consider it as a feature while estimating the Jensen-Shannon Distance", but looking at the code it seems to me that the "dist.JSD" function uses it regardless of the "no.unassigned" option. I don't know whether it is a bug or some concept I'm missing.

Many thanks in advance Xavi

tapj commented 6 years ago

Hi Xavi,

Actually, we used the unassigned feature to calculate relative abundance, but we removed it for the JSD computation and the principal coordinate analysis.

I understand that the "no.unassigned" option is not clear; it means "do you want to remove unassigned?" I would advise you to remove the unassigned fraction first and then use the Biotyper function.
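The order of operations described above can be sketched in R as follows. This is only an illustration under stated assumptions, not the code in BiotypeR.r: the toy table, the row name "Unassigned", and the helper `jsd_dist` are all hypothetical. Note that the columns are not renormalized after the unassigned row is dropped, which follows the description above (relative abundance computed with unassigned reads, distance computed without them).

```r
# Toy genus-by-sample count table (hypothetical data; the "Unassigned"
# row name is an assumption about how your table labels that fraction).
counts <- matrix(
  c(50, 10, 30,
    20, 60, 30,
    10, 10, 20,
    20, 20, 20),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Bacteroides", "Prevotella", "Ruminococcus", "Unassigned"),
                  c("S1", "S2", "S3"))
)

# Step 1: relative abundances are computed on all reads, unassigned included.
rel_abund <- sweep(counts, 2, colSums(counts), "/")

# Step 2: drop the unassigned row before any distance computation.
features <- rel_abund[rownames(rel_abund) != "Unassigned", , drop = FALSE]

# Hypothetical helper: Jensen-Shannon distance (square root of the
# Jensen-Shannon divergence) between the columns of x. Zeros are replaced
# by a small pseudocount so that the logarithms stay finite.
jsd_dist <- function(x, pseudocount = 1e-6) {
  x[x == 0] <- pseudocount
  kld <- function(p, q) sum(p * log(p / q))
  n <- ncol(x)
  d <- matrix(0, n, n, dimnames = list(colnames(x), colnames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      m <- (x[, i] + x[, j]) / 2
      d[i, j] <- sqrt(0.5 * kld(x[, i], m) + 0.5 * kld(x[, j], m))
    }
  }
  as.dist(d)
}

# Step 3: distances between samples, computed without the unassigned feature.
d <- jsd_dist(features)
print(d)
```

The resulting `dist` object can then be passed to whatever clustering or ordination step you use; filtering the table yourself makes it explicit which features enter the distance.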

This way of making enterotypes is deprecated; I invite you to try the method published by Costea et al.:

https://www.nature.com/articles/s41564-017-0072-8

hope that helps,

Julien

xmarti6 commented 6 years ago

Hi Julien, thanks for pointing out the paper and for your recommendation. The problem I see with the new approach is that it is a web-based service with no R code, and thus cannot be used within a local pipeline. Am I right? If not, can I download the code somewhere? By the way, it would be nice if a note on http://enterotyping.embl.de/index.html warned that it is deprecated in favour of http://enterotypes.org/index.html :)

Best Xavi

tapj commented 6 years ago

Hi Xavi,

Indeed, the web interface is not the best solution for building your own data workflow. I would therefore recommend the DirichletMultinomial Bioconductor package, which is the one actually used in the Costea article.

https://bioconductor.org/packages/release/bioc/html/DirichletMultinomial.html
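For a local pipeline, a minimal sketch of fitting Dirichlet multinomial mixtures with this package might look like the following. It is not the workflow behind the web service or the Costea article: the simulated counts, the range of `k`, and the use of the Laplace approximation to pick the number of components are assumptions made for illustration. The fit is wrapped in `requireNamespace` so the script still runs where the package is not installed.

```r
# Simulated samples-by-genera count matrix (hypothetical data): two groups
# of 20 samples drawn from different multinomial profiles.
set.seed(1)
taxa <- paste0("genus", 1:6)
prob_a <- c(0.50, 0.20, 0.10, 0.10, 0.05, 0.05)
prob_b <- c(0.05, 0.05, 0.10, 0.10, 0.20, 0.50)
counts <- t(cbind(rmultinom(20, 1000, prob_a),
                  rmultinom(20, 1000, prob_b)))
dimnames(counts) <- list(paste0("S", 1:40), taxa)

if (requireNamespace("DirichletMultinomial", quietly = TRUE)) {
  # Fit models with 1 to 3 components; dmn() expects samples in rows.
  fits <- lapply(1:3, function(k) {
    DirichletMultinomial::dmn(counts, k = k, seed = 1)
  })

  # Laplace approximation of the model evidence; here the smallest value
  # is taken as the best fit (an assumption about model selection).
  fit_score <- sapply(fits, DirichletMultinomial::laplace)
  best <- fits[[which.min(fit_score)]]

  # Most likely component for each sample.
  assignments <- DirichletMultinomial::mixture(best, assign = TRUE)
  print(table(assignments))
}
```

With real data, `counts` would be your own integer count table (samples in rows), and the range of `k` would need to be wide enough to cover the number of community types you expect.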

Thank you for your feedback about the enterotype website; I will ask them to update the information.

best, Julien.

ScaonE commented 2 years ago

Dear all, I totally agree with Xavi here, having what's behind http://enterotypes.org as a standalone version would be a huge plus. Same goes for pointing toward the most recent methods for enterotypes (Costea et al. 2018 & the DirichletMultinomial package).

Best regards