HelenaLC / CATALYST

Cytometry dATa anALYsis Tools

plotDiffHeatmap sort by condition #383

Closed SamWell16 closed 10 months ago

SamWell16 commented 10 months ago

First of all, thank you for this very useful package!

I would really appreciate some help with modifying the output of "plotDiffHeatmap". I would like to sort the columns by condition instead of by "sample_id". In addition, in my case, when the "patient_id" annotation is shown, the colors for the conditions are very similar. Is there an option to choose the colors for the conditions?

Fig1_plotDiffHeatmap.pdf

Thanks!

HelenaLC commented 10 months ago
  1. The column order is determined by levels(sce$sample_id). So, setting these to match your desired order (by condition) will 'automatically' achieve this.
  2. Regarding custom colors for annotations, please see issue #244, where I posted a minimal reproducible example for this (the part following the comment "# display test results" is the crucial bit).
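
As a rough illustration of point 1, here is a minimal base R sketch on a toy data frame. The data frame `cd` and its values are hypothetical stand-ins for the per-cell metadata of an SCE; no CATALYST functions are used. It shows that sorting rows leaves factor levels unchanged, whereas resetting the levels changes the order that level-based plotting would follow.

```r
# Toy per-cell metadata (hypothetical values; stands in for colData(sce))
cd <- data.frame(
    sample_id = factor(rep(c("s1", "s2", "s3", "s4"), each = 3)),
    condition = factor(rep(c("B", "A", "B", "A"), each = 3)))

# Sorting the rows does not touch the factor levels
sorted <- cd[order(cd$condition), ]
levels(sorted$sample_id)

# Instead, reorder the levels: take one row per sample,
# then sort the samples by their condition
per_sample <- unique(cd)
new_levels <- as.character(per_sample$sample_id[order(per_sample$condition)])
cd$sample_id <- factor(cd$sample_id, levels = new_levels)
levels(cd$sample_id)
```

With a real object, the same idea would be applied to sce$sample_id and sce$condition.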

SamWell16 commented 10 months ago

Thanks for your help!

For 1., if I understand correctly, I need to reorder the columns of the data, which I tried to do using sce %>% arrange(condition), but I got an error:

Error in SummarizedExperiment:::.SummarizedExperiment.charbound(subset, : index out of bounds: 1 2 ... 312639 312640

Could you suggest another way to simply "order by condition"?

Thanks!

HelenaLC commented 10 months ago

arrange() will reorder the cells, but what you need to fix is levels(sce$sample_id), as I said above. Below is a minimal reproducible example for 1) custom annotation colors and 2) fixing the sample order.

NOTE: Ideally, samples would be listed in the desired/most meaningful order in the metadata sheet when the object is constructed with prepData(). Then, at the very beginning of the workflow, one would set sce$sample_id to be a factor with levels matching how the samples are listed in the metadata, and all plots (not only plotDiffHeatmap()) will "respect" that order consistently. See, for example, the F1000Research workflow, where we fix both sample and condition order at the very beginning.

# dependencies
suppressPackageStartupMessages({
    library(CATALYST)
    library(ComplexHeatmap)
    library(diffcyt)
})

# construct SCE & run clustering
data(PBMC_fs, PBMC_panel, PBMC_md)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)
sce <- cluster(sce, verbose=FALSE)

# order sample levels by condition
ids <- unique(sce$sample_id)      # one entry per sample
idx <- match(ids, sce$sample_id)  # index of the first cell of each sample
o <- order(sce$condition[idx])    # sort samples by their condition
sce$sample_id <- factor(sce$sample_id, ids[o])

# differential analysis
design <- createDesignMatrix(PBMC_md, cols_design=3)
contrast <- createContrast(c(0, 1))
ds <- diffcyt(sce, design=design, contrast=contrast, 
    analysis_type="DS", method_DS="diffcyt-DS-limma",
    clustering_to_use="meta20", verbose=FALSE)

# display test results
hm <- plotDiffHeatmap(sce, rowData(ds$res), fdr=Inf, top_n=20)
cm <- ColorMapping("condition", c(Ref="black", BCRXL="grey"))
hm@top_annotation@anno_list$condition@color_mapping <- cm
hm@top_annotation@anno_list$condition@fun@var_env$color_mapping <- cm
hm


# randomize samples to demo what happens
sce$sample_id <- factor(sce$sample_id, sample(ids))
levels(sce$sample_id) # this'll be the x-axis order now...
plotDiffHeatmap(sce, rowData(ds$res), fdr=Inf, top_n=20)
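
The note above about fixing the order at the very beginning of the workflow could look roughly like the following sketch. The metadata table `md` is a hypothetical toy example with the same column names as PBMC_md; only base R is used. Whether prepData() keeps factor levels set on the metadata is an assumption to verify for your CATALYST version; if it does not, the same factor() call can be applied to sce$sample_id right after construction.

```r
# Toy metadata sheet (hypothetical values; column names mirror PBMC_md)
md <- data.frame(
    file_name = paste0("file", 1:4, ".fcs"),
    sample_id = c("BCRXL1", "Ref1", "BCRXL2", "Ref2"),
    condition = c("BCRXL", "Ref", "BCRXL", "Ref"),
    patient_id = c("Patient1", "Patient1", "Patient2", "Patient2"))

# Put the reference condition first
md$condition <- factor(md$condition, levels = c("Ref", "BCRXL"))

# Let the sample levels follow the condition order
md$sample_id <- factor(md$sample_id,
    levels = md$sample_id[order(md$condition)])
levels(md$sample_id)
```

Setting the order once on the metadata avoids having to re-level the samples before each plot.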


SamWell16 commented 10 months ago

Super helpful! I realize now that I didn't construct my object properly from the beginning, and that caused the confusion.

Thanks again!