biocore / qurro

Visualize differentially ranked features (taxa, metabolites, ...) and their log-ratios across samples
https://biocore.github.io/qurro
BSD 3-Clause "New" or "Revised" License

Visualizing phylogenetic beta-diversity with Qurro #248

Closed Jigyasa3 closed 4 years ago

Jigyasa3 commented 4 years ago

Hey!

I have metagenome data in which the host samples are phylogenetically related. To account for the compositional nature of the data, I converted the relative abundances to Aitchison distances using DEICODE.

To account for phylogenetic distance, I calculated phylogenetic beta-diversity on the distance matrix obtained from the DEICODE run, using the comdistnt function in the Picante R package. I wanted to ask if it makes sense to use this phylogenetic beta-diversity in the Qurro package for analyzing differential ranking of bacterial taxa?

The next question is how? Qurro takes ordination.txt file as input rather than the distance matrix file.

Thanks again!

fedarko commented 4 years ago

Hi @Jigyasa3. I think it makes sense to answer your questions in reverse order.

> The next question is how? Qurro takes ordination.txt file as input rather than the distance matrix file.

The reason Qurro takes in an "ordination" rather than a distance matrix is that Qurro doesn't know how to use any of the information in a distance matrix. Running DEICODE gives you two outputs, an ordination and a distance matrix -- and all that Qurro uses from DEICODE's output are the feature loadings within the ordination. (These feature loadings are the "rankings" used for the rank plot in Qurro.)
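To make the "rankings" idea concrete, here is a minimal sketch in plain Python of how feature loadings along one axis can rank features, and how a log-ratio between the top- and bottom-ranked features can then be computed per sample. The loadings, counts, and the helper names `rank_features` and `log_ratio` are all made up for illustration; this is not Qurro's code, just the general idea under the assumption that a log-ratio is the natural log of summed numerator counts over summed denominator counts.

```python
import math

# Hypothetical feature loadings along one ordination axis (feature ID -> loading).
loadings = {"taxon_A": 0.8, "taxon_B": -0.5, "taxon_C": 0.1, "taxon_D": -0.9}

# Hypothetical counts per sample (sample ID -> {feature ID -> count}).
counts = {
    "S1": {"taxon_A": 40, "taxon_B": 10, "taxon_C": 5, "taxon_D": 20},
    "S2": {"taxon_A": 5, "taxon_B": 30, "taxon_C": 5, "taxon_D": 0},
}


def rank_features(feature_loadings):
    """Return feature IDs sorted from lowest to highest loading."""
    return sorted(feature_loadings, key=feature_loadings.get)


def log_ratio(sample_counts, numerator, denominator):
    """Natural log of (summed numerator counts / summed denominator counts).

    Returns None when either sum is zero, since the log-ratio is undefined.
    """
    num = sum(sample_counts.get(f, 0) for f in numerator)
    den = sum(sample_counts.get(f, 0) for f in denominator)
    if num == 0 or den == 0:
        return None
    return math.log(num / den)


ranked = rank_features(loadings)
top, bottom = ranked[-1:], ranked[:1]  # highest- and lowest-ranked feature
ratios = {s: log_ratio(c, top, bottom) for s, c in counts.items()}
print(ranked)
print(ratios)
```

Without loadings there is nothing to sort by in `rank_features`, which is why a distance matrix on its own is not enough input here.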

> I wanted to ask if it makes sense to use this phylogenetic beta-diversity in the Qurro package for analyzing differential ranking of bacterial taxa?

I don't think it makes sense. I've never used Picante or comdistnt before, but from what I can tell comdistnt just gives you a distance matrix as output. Since you don't have information on how features are associated with variation in this distance matrix (i.e. feature loadings), I don't believe it's possible (at least right now) to use comdistnt's output with Qurro -- you would need to use a beta-diversity method that provides you with feature loadings. If you don't have these loadings, you don't have anything to rank features by. (If you just want to try out computing the log-ratios of features in your dataset without using rankings, that's actually in the works for Qurro... however, it doesn't sound like that is what you were looking for.)

Of course, you could totally load the DEICODE ordination into Qurro (i.e. the outputs from before you ran comdistnt) -- but this wouldn't include any of the "phylogenetic" information that comdistnt gives you, I suppose. (That being said, I think it's worth noting that you can pass in taxonomic information to Qurro as "feature metadata" alongside a DEICODE ordination -- this'll let you interactively search features by their classified taxonomies, and see how taxa are ranked for a given feature loading axis.)

One last thing

You might already know this, but just to be clear: since you do have a distance matrix from comdistnt, I think you should be able to load it into QIIME 2 and then visualize it through a PCoA in Emperor. A workflow like

  1. import the comdistnt distance matrix into QIIME 2
  2. run qiime diversity pcoa on the imported distance matrix -- this will give you a PCoAResults artifact
  3. run qiime emperor plot on the PCoAResults artifact, to get a fancy interactive visualization of your distance matrix using Emperor

would at least show you how your samples vary.
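If step 1 turns out to be the sticking point: as far as I know, QIIME 2's DistanceMatrix import expects a tab-separated square matrix whose header row starts with an empty cell followed by the sample IDs, with each later row starting with its sample ID. Here is a minimal sketch (Python, standard library only) that writes a matrix in that layout and checks that it is symmetric with a zero diagonal. The function name `write_lsmat` and the example values are hypothetical, and the layout is my assumption about the expected format, so check it against the QIIME 2 importing docs.

```python
import csv
import io


def write_lsmat(ids, matrix, handle):
    """Write a square distance matrix as tab-separated text with an ID header row."""
    n = len(ids)
    if len(set(ids)) != n:
        raise ValueError("sample IDs must be unique")
    if len(matrix) != n or any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square and match the number of IDs")
    for i in range(n):
        if matrix[i][i] != 0:
            raise ValueError("diagonal entries must be zero")
        for j in range(i):
            if matrix[i][j] != matrix[j][i]:
                raise ValueError("matrix must be symmetric")
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow([""] + list(ids))  # header: empty cell, then the sample IDs
    for sample_id, row in zip(ids, matrix):
        writer.writerow([sample_id] + [repr(float(v)) for v in row])


# Example with made-up distances; swap the StringIO for open("dm.tsv", "w") to write a file.
buffer = io.StringIO()
write_lsmat(["S1", "S2"], [[0, 0.5], [0.5, 0]], buffer)
print(buffer.getvalue())
```

From there, the three steps above correspond to `qiime tools import` (with `--type DistanceMatrix`), `qiime diversity pcoa`, and `qiime emperor plot`.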

Hope this helps, and let me know if you have any more questions!

Jigyasa3 commented 4 years ago

Thank you so much @fedarko for a detailed reply! I didn't know that the Emperor could take a distance matrix created outside of QIIME 2. I will use that for visualization. I will also try running the sample taxonomy as "feature loading" to get bacterial taxa ranking as you suggested. Thank you so much for the advice!

fedarko commented 4 years ago

> I didn't know that the Emperor could take a distance matrix created outside of QIIME 2. I will use that for visualization.

Glad I could help! This thread might be helpful if you run into problems trying to import the distance matrix into QIIME 2, also.

> I will also try running the sample taxonomy as "feature loading" to get bacterial taxa ranking as you suggested.

I don't know what you mean by this sentence, sorry. (I don't know what you mean by "sample taxonomy.") How do you plan to produce feature rankings?

> Thank you so much for the advice!

No problem!

Jigyasa3 commented 4 years ago

hey @fedarko

Thanks, I was able to get an Emperor PCoA for the comdistnt matrix! I have a follow-up question. Is it possible to condense (or collapse) the DEICODE data matrix back into features while keeping the log-ratio transformations and matrix-filling? For any of the downstream phylogenetic analyses, the functions require the data to be in the "original" format.

fedarko commented 4 years ago

Hi @Jigyasa3 -- not sure if that'd be possible. I think it might be doable by modifying DEICODE's code to just return opt.solution from around here in the code, but you would have to check with the DEICODE development team (e.g. @cameronmartino) to see if that's actually a reasonable way of doing that. I'd suggest opening a separate issue over on DEICODE's repository, since this isn't really a Qurro problem at this point :)
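For context, and assuming the "log-ratio transformation" in question is the robust centered log-ratio (rclr) that DEICODE is described as applying before matrix completion, here is a rough sketch of that transform for a single sample in plain Python. It only covers the transform: zeros are left as missing (None), and the matrix-filling step is not reproduced. The function name `rclr` and the example counts are illustrative, not DEICODE's actual code.

```python
import math


def rclr(sample_counts):
    """Robust centered log-ratio for one sample.

    Takes the log of each nonzero count and subtracts the mean log over the
    nonzero counts only; zero counts are returned as None (treated as missing).
    """
    logs = [math.log(v) for v in sample_counts if v > 0]
    if not logs:
        return [None] * len(sample_counts)
    center = sum(logs) / len(logs)
    return [math.log(v) - center if v > 0 else None for v in sample_counts]


# Made-up counts for one sample with one unobserved feature.
transformed = rclr([1, 0, 4])
print(transformed)
```

The None entries are the ones a matrix-completion step would fill in, which is why getting a fully "filled" feature table back would need the completed matrix from DEICODE itself.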

fedarko commented 4 years ago

I'm going to close this issue for now, but if you have any other questions about Qurro feel free to reopen this or open a new issue!