Closed ElySeraidarian closed 3 months ago
Made some changes, I'll add a parameter to increase the number of principal components and maybe one optional parameter to select the name used for the matrix in reducedDim slot if it has been changed by the user? So that I'll be able to recognize the method used
I have added support for more PC components (I have not added a parameter, it plots the number of components given by the matrix but maybe I should add if user wants to plot less?). Also added support if name is changed in reducedDim slot.
Yes, it is necessary to give the number of components. Argument name "ncomponents" has been used elsewhere.
By default it could be ncomponents=5?
Hadn't seen comments I'll check this
For heatmap, "shadow text" might be better
Every change should have been done according to your comments, the values in tree layout is now accurately matching with labels (getting same results with heatmap and barplot). I'll focus on making it operational for every reduction method (detecting which method has been performed) with what @thpralas has done ( and maybe find a solution if user is not aware of these wrappers)
Just for clarity, could you "resolve" all comments that have been addressed?
Solution for non-wrapper users would be to support plain numeric matrices (without tree) in addition to (Tree)SE objects. If users want to include a tree then I think it is better to require that they create a TreeSE object.
I have been able to make it work for LDA method with the wrapper
library(mia)
library(ggtree)
library(scater)
data("GlobalPatterns", package = "mia")
tse <- GlobalPatterns
tse <- transformAssay(tse, method = "clr", pseudocount = 1)
tse <- agglomerateByPrevalence(tse, rank="Phylum", update.tree = TRUE)
tse <- addLDA(tse)
plotLoadings(tse, ncomponents = 2)
However I have not been able to autodetect method if reducedDim slot contains several methods, then user will have to specify name using dimred = "method_name" and I don't believe this is possible either?
COuld you use reducedDimNames(tse)
to detect?
For default layout, I would keep barplot as it is the most simple and fastest to look
I have been able to make it work for LDA method with the wrapper
library(mia) library(ggtree) library(scater) data("GlobalPatterns", package = "mia") tse <- GlobalPatterns tse <- transformAssay(tse, method = "clr", pseudocount = 1) tse <- agglomerateByPrevalence(tse, rank="Phylum", update.tree = TRUE) tse <- addLDA(tse) plotLoadings(tse, ncomponents = 2)
However I have not been able to autodetect method if reducedDim slot contains several methods, then user will have to specify name using dimred = "method_name" and I don't believe this is possible either?
That is ok in my opinion. Other functions also assume that user specifies the reducedDim to plot.
For instance:
plotReducedDim(tse, dimred = "PCA") --> you have to always specify dimred in this function
I am also wondering if current heatmap layout takes full advantage of heatmaps. Heatmaps are create way to visualize large batch of data. However, in this layout it rather makes the plot more complex compared to simpler barplot. Instead, the heatmap could be in this format.
Current heatmap
Barplot with facets
After thinking this, we could have 2 layouts: "barplot" and "heatmap". The we could have hidden parameter "add.tree" that decides whether to add tree to heatmap (the "tree" layout is also one kind of heatmap)
The implementation could be then simplified a lot without losing features. This helps in the maintenance.
Changes have been made, heatmap has been changed and parameter add.tree instead of tree layout
I simplified the code a lot. Check that it still works as expected. Also run BiocCheck::BiocCheck() and fix issues related to this added R file (especially, do not exceed 80 characters in line width). Also use inheritParam tag in the documentation.
Thing to discuss: should rank
be row.var
? --> variable defining a column from rowData. That is what it is doing currently
Thing to discuss: should
rank
berow.var
? --> variable defining a column from rowData. That is what it is doing currently
Yes I believe this would make more sense
Seems good.
But we could also still create a parameter rank that agglomerates tree as in the other functions
But we could also still create a parameter rank that agglomerates tree as in the other functions
If we agglomerate the data, we lose the match between loadings and rows. Loadings are for features of un-agglomerated data --> that is my first impression, check if that is correct
But we could also still create a parameter rank that agglomerates tree as in the other functions
If we agglomerate the data, we lose the match between loadings and rows. Loadings are for features of un-agglomerated data --> that is my first impression, check if that is correct
Yes that's right, agglomeration has to be done before performing the reduction method
1) I removed authors. See the issue: https://github.com/microbiome/mia/issues/628 2) You should perform first agglomeration and then transform the data (fixed) 3) I added a check that the feature names match in tree heatmap
Looks very nice. Are you interested in updating OMA: https://microbiome.github.io/OMA/docs/devel/pages/beta_diversity.html#sec-dbrda-workflow
If I remember correctly, there were some error?
Looks very nice. Are you interested in updating OMA: https://microbiome.github.io/OMA/docs/devel/pages/beta_diversity.html#sec-dbrda-workflow
If I remember correctly, there were some error?
I will update using this function, I found same results when plotting. Also this example shows how to retrieve the loadings (for now we can plot only by getting matrix and then giving the matrix to plot).
According to #92, we need a way to plot feature loadings after performing a reduction method by several ways. Here is a first version with PCA that plots feature loadings in 4 different ways : Tree, Barplot, Screeplot & Heatmap. This will be available for LDA & NMF as soon as it is possible to retrieve loadings matrix from reducedDim slot. There are parameters that can be added so that we can talk about it (for example the number of features plotted). The parameter method is probably doomed to disappear as we will be able to recognize which method has been used in the reducedDim slot. There are examples for NMF & LDA at the end for now without using the function. This will need examples in OMA and strong documentation in order to tell people how to use this function (agglomerate tree if plotting the tree, only plotting few features, ...). It is currently possible to give either loadings matrix or TSE object to plot.
I will add tests & generate documentation also later.