HelenaLC / CATALYST

Cytometry dATa anALYsis Tools

plotDiffHeatmap sort by condition #383

Closed SamWell16 closed 6 months ago

SamWell16 commented 6 months ago

First of all, thank you for this very useful package!

I would really appreciate your help with modifying the output of "plotDiffHeatmap". I would like to sort the columns by condition instead of by "sample_id". In addition, in my case, when the "patient_id" annotation is shown, the colors for the conditions are very similar. Is there an option to select the colors for the conditions?

Fig1_plotDiffHeatmap.pdf

Thanks!

HelenaLC commented 6 months ago
  1. The column order is determined by levels(sce$sample_id). So, setting these to match your desired order (by condition) will 'automatically' achieve this.
  2. Regarding custom colors for annotations, please see issue #244, where I posted a minimal reproducible example for this (the part following the comment # display test results is the crucial bit).
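
To illustrate point 1 without the full object, here is a minimal base-R sketch. The data frame `cells` and its sample and condition labels are made up for illustration; they merely stand in for the sample_id and condition columns of the SCE, and only the handling of factor levels is meant to carry over.

```r
# Hypothetical per-cell annotation (toy data, not from the package)
cells <- data.frame(
    sample_id = factor(c("s1", "s2", "s3", "s4", "s1", "s3")),
    condition = factor(c("B", "A", "B", "A", "B", "B")))

# first cell of each sample, used to look up that sample's condition
ids <- levels(cells$sample_id)
idx <- match(ids, cells$sample_id)

# order samples by condition and use that as the new level order;
# the cells themselves are not reordered, only the factor levels
o <- order(cells$condition[idx])
cells$sample_id <- factor(cells$sample_id, levels = ids[o])

levels(cells$sample_id)
```

The values of `cells$sample_id` stay where they are; only the level order changes, and that level order is what the heatmap columns would follow.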

SamWell16 commented 6 months ago

Thanks for your help!

For 1., if I understand correctly, I need to reorder the data frame column, which I tried using sce %>% arrange(condition), but I got an error:

Error in SummarizedExperiment:::.SummarizedExperiment.charbound(subset, : index out of bounds: 1 2 ... 312639 312640

Could you suggest another way to simply "order by condition"?

Thanks!

HelenaLC commented 6 months ago

arrange() will reorder the cells, but you need to fix the levels(sce$sample_id) as I said above... Below is a minimal reproducible example for 1) custom annotation colors and 2) fixing the sample order...

NOTE: Ideally, samples would be listed in the desired/most meaningful order in the metadata sheet upon construction of the object with prepData(); then, at the very beginning of the workflow, one would set sce$sample_id to be a factor with levels matching how they are listed in the metadata, and all plots will "respect" that order consistently (not only plotDiffHeatmap()) [see, for example, the F1000Research workflow, where we fix both sample and condition order at the very beginning].

# dependencies
suppressPackageStartupMessages({
    library(CATALYST)
    library(ComplexHeatmap)
    library(diffcyt)
})

# construct SCE & run clustering
data(PBMC_fs, PBMC_panel, PBMC_md)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)
sce <- cluster(sce, verbose=FALSE)

# order sample levels by condition
ids <- unique(sce$sample_id)
idx <- match(ids, sce$sample_id)
o <- order(sce$condition[idx])
sce$sample_id <- factor(sce$sample_id, ids[o])

# differential analysis
design <- createDesignMatrix(PBMC_md, cols_design=3)
contrast <- createContrast(c(0, 1))
ds <- diffcyt(sce, design=design, contrast=contrast, 
    analysis_type="DS", method_DS="diffcyt-DS-limma",
    clustering_to_use="meta20", verbose=FALSE)

# display test results
hm <- plotDiffHeatmap(sce, rowData(ds$res), fdr=Inf, top_n=20)
# custom condition colors: overwrite the color mapping of the
# "condition" annotation in both places where the heatmap stores it
cm <- ColorMapping("condition", c(Ref="black", BCRXL="grey"))
hm@top_annotation@anno_list$condition@color_mapping <- cm
hm@top_annotation@anno_list$condition@fun@var_env$color_mapping <- cm
hm

# randomize samples to demo what happens
sce$sample_id <- factor(sce$sample_id, sample(ids))
levels(sce$sample_id) # this'll be the x-axis order now...
plotDiffHeatmap(sce, rowData(ds$res), fdr=Inf, top_n=20)

SamWell16 commented 6 months ago

Super helpful! I realized now that I didn't construct my object properly from the beginning and that caused the confusion.

Thanks again!
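
For later readers, the NOTE above about listing samples in a meaningful order in the metadata can be sketched in base R. The metadata table `md` below is invented for illustration (its column names mirror PBMC_md, but the values are made up), and the character vector `sample_id` is only a stand-in for sce$sample_id; with a real object, the same factor() call would be applied to sce$sample_id right after construction with prepData().

```r
# Hypothetical metadata sheet (values invented; columns mirror PBMC_md)
md <- data.frame(
    file_name  = c("a.fcs", "b.fcs", "c.fcs", "d.fcs"),
    sample_id  = c("BCRXL1", "Ref1", "BCRXL2", "Ref2"),
    condition  = c("BCRXL", "Ref", "BCRXL", "Ref"),
    patient_id = c("P1", "P1", "P2", "P2"),
    stringsAsFactors = FALSE)

# fix the condition order explicitly (reference first)
md$condition <- factor(md$condition, levels = c("Ref", "BCRXL"))

# list samples in the desired order: by condition, then by patient
md <- md[order(md$condition, md$patient_id), ]

# stand-in for sce$sample_id (one entry per cell)
sample_id <- c("Ref2", "BCRXL1", "Ref1", "BCRXL2", "BCRXL1")

# levels follow the order in which samples are listed in the metadata
sample_id <- factor(sample_id, levels = md$sample_id)

levels(sample_id)
```

Checking levels(sce$sample_id) once at the start of an analysis is a cheap way to confirm that every downstream plot will use the intended order.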