HelenaLC / CATALYST

Cytometry dATa anALYsis Tools

Row annotation in Exprheatmap #336

Closed Ind2022 closed 1 year ago

Ind2022 commented 1 year ago

Dear CATALYST R package group,

Thanks for a great analysis tool. I am reaching out with a question about heatmaps. I am working with CyTOF data, which I read in with the read_steinbock function as an "SPE" object.

After clustering etc., I am trying to create two heatmaps, and I have two questions:

  1. Expression heatmap: I want experimental condition and sample ID as row annotations, but it only shows sample IDs. Could you please help me understand at which step these row annotations need to be read in, so that the SPE object has them as metadata when it is created?

  2. Frequency heatmap: I want sample IDs as columns. These were not included when the SPE object was created, because it was initially created with image IDs, but I would now like to replace those IDs with sample IDs in the frequency heatmap.

Thanks for your help, Indrani

nilseling commented 1 year ago

Hi @Ind2022

just as a note here that read_steinbock is exported by imcRtools and the resulting SPE object might not contain the necessary entries for visualization using CATALYST. However, if you post a code example we can have a look at this.

You could also have a look at this for visualization options:

https://bodenmillergroup.github.io/IMCDataAnalysis/single-cell-visualization.html

Ind2022 commented 1 year ago

Thanks for your reply. I worked on the colData of the SPE object and was able to generate one row annotation with the expression heatmap, but the row annotation itself doesn't show any colors. Most of the visualization options are for single-cell objects, but I am working with the SPE object structure.

spe <- read_steinbock('data/steinbock', return_as = 'spe')

The counts in this spe object were then normalized, and clusters were generated and assigned with Rphenograph.

The spe object has metadata with sample ID and treatment, and colors were assigned to both using a color vector.

plotExprHeatmap(spe, k = "cluster_id", features = rownames(spe), bars = FALSE, perc = FALSE, row_clust = TRUE, by = 'sample_id', row_anno = c('treatment', 'sample_id'))

HelenaLC commented 1 year ago

The SPE class inherits from SCE, so this is not the issue; all visualizations should work on both classes. Perhaps you could also post table(spe$treatment, spe$sample_id)? And could you clarify what you mean when you say the heatmap works but doesn’t show colors? If the row annotations are not being included, it could indicate that there’s something off with the colData, e.g., treatment doesn’t map uniquely to sample_id. But I would have to see the data to understand what’s wrong… posting colData(spe) and just spe would also help.
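The uniqueness check suggested above can be sketched in base R. This is only an illustration: `cd` is a hypothetical toy data frame standing in for `as.data.frame(colData(spe))`, and its values are invented rather than taken from the poster's data.

```r
# Toy stand-in for as.data.frame(colData(spe)); the values are invented.
cd <- data.frame(
  sample_id = c("s1", "s1", "s2", "s2", "s3"),
  treatment = c("ctrl", "ctrl", "drug", "drug", "drug"),
  stringsAsFactors = FALSE
)

# Cross-tabulate treatment (rows) against sample_id (columns).
tab <- table(cd$treatment, cd$sample_id)

# Count how many distinct treatments each sample is linked to.
n_treatments <- colSums(tab > 0)

# Samples linked to more than one treatment (empty if the mapping is unique).
ambiguous <- names(n_treatments)[n_treatments > 1]
maps_uniquely <- length(ambiguous) == 0
```

If `ambiguous` is non-empty, those sample IDs would be the first place to look when a condition-level annotation goes missing.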

nilseling commented 1 year ago

I was actually able to reproduce the error with the SPE object available here.

plotExprHeatmap(spe)

resulted in the following image:

image

R version 4.2.3 (2023-03-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.7.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SpatialExperiment_1.8.1     CATALYST_1.22.0             SingleCellExperiment_1.20.1 SummarizedExperiment_1.28.0 Biobase_2.58.0              GenomicRanges_1.50.2       
 [7] GenomeInfoDb_1.34.9         IRanges_2.32.0              S4Vectors_0.36.2            BiocGenerics_0.44.0         MatrixGenerics_1.10.0       matrixStats_0.63.0         

loaded via a namespace (and not attached):
  [1] backports_1.4.1             circlize_0.4.15             drc_3.0-1                   plyr_1.8.8                  igraph_1.4.1                ConsensusClusterPlus_1.62.0
  [7] splines_4.2.3               flowCore_2.10.0             BiocParallel_1.32.6         ggplot2_3.4.1               scater_1.26.1               TH.data_1.1-1              
 [13] digest_0.6.31               foreach_1.5.2               magick_2.7.4                viridis_0.6.2               fansi_1.0.4                 magrittr_2.0.3             
 [19] ScaledMatrix_1.6.0          cluster_2.1.4               doParallel_1.0.17           limma_3.54.2                ComplexHeatmap_2.14.0       R.utils_2.12.2             
 [25] sandwich_3.0-2              cytolib_2.10.1              colorspace_2.1-0            ggrepel_0.9.3               dplyr_1.1.1                 crayon_1.5.2               
 [31] RCurl_1.98-1.10             survival_3.5-5              zoo_1.8-11                  iterators_1.0.14            glue_1.6.2                  polyclip_1.10-4            
 [37] gtable_0.3.3                nnls_1.4                    zlibbioc_1.44.0             XVector_0.38.0              GetoptLong_1.0.5            DelayedArray_0.24.0        
 [43] car_3.1-1                   BiocSingular_1.14.0         DropletUtils_1.18.1         Rhdf5lib_1.20.0             shape_1.4.6                 HDF5Array_1.26.0           
 [49] abind_1.4-5                 scales_1.2.1                mvtnorm_1.1-3               edgeR_3.40.2                rstatix_0.7.2               Rcpp_1.0.10                
 [55] plotrix_3.8-2               viridisLite_0.4.1           clue_0.3-64                 dqrng_0.3.0                 rsvd_1.0.5                  FlowSOM_2.6.0              
 [61] RColorBrewer_1.1-3          pkgconfig_2.0.3             XML_3.99-0.14               R.methodsS3_1.8.2           farver_2.1.1                scuttle_1.8.4              
 [67] locfit_1.5-9.7              utf8_1.2.3                  tidyselect_1.2.0            rlang_1.1.0                 reshape2_1.4.4              munsell_0.5.0              
 [73] tools_4.2.3                 cli_3.6.1                   generics_0.1.3              broom_1.0.4                 ggridges_0.5.4              stringr_1.5.0              
 [79] purrr_1.0.1                 sparseMatrixStats_1.10.0    R.oo_1.25.0                 compiler_4.2.3              rstudioapi_0.14             beeswarm_0.4.0             
 [85] png_0.1-8                   ggsignif_0.6.4              tibble_3.2.1                tweenr_2.0.2.9000           stringi_1.7.12              lattice_0.20-45            
 [91] Matrix_1.5-3                vctrs_0.6.1                 pillar_1.9.0                lifecycle_1.0.3             rhdf5filters_1.10.1         GlobalOptions_0.1.2        
 [97] BiocNeighbors_1.16.0        data.table_1.14.8           cowplot_1.1.1               bitops_1.0-7                irlba_2.3.5.1               colorRamps_2.3.1           
[103] R6_2.5.1                    gridExtra_2.3               RProtoBufLib_2.10.0         vipor_0.4.5                 codetools_0.2-19            MASS_7.3-58.3              
[109] gtools_3.9.4                rhdf5_2.42.0                rjson_0.2.21                withr_2.5.0                 multcomp_1.4-23             GenomeInfoDbData_1.2.9     
[115] parallel_4.2.3              grid_4.2.3                  beachmat_2.14.0             tidyr_1.3.0                 DelayedMatrixStats_1.20.0   carData_3.0-5              
[121] Cairo_1.6-0                 Rtsne_0.16                  ggpubr_0.6.0                ggnewscale_0.4.8            ggforce_0.4.1               ggbeeswarm_0.7.1

Ind2022 commented 1 year ago

Thank you for reproducing the error. Yes, this is exactly what I see when I generate the expression heatmap with my data. As you can see, the row annotation colors appear in the legend, but they are not drawn in the row annotation itself.

HelenaLC commented 1 year ago

Hm, super curious. But it seems this is caused by sample_ids not being factors, but of type character. This solved it: ... I will push a fix to ensure sample_ids are converted to factors if they aren't already (should be in with the next Bioc release... current branch has been frozen).

library(CATALYST)
spe <- readRDS("~/Downloads/spe.rds")
spe$sample_id <- factor(spe$sample_id)
plotExprHeatmap(spe)

image
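The same workaround can be applied to every character annotation column at once. The sketch below is an assumption-laden illustration in base R: `cd` is a hypothetical toy data frame standing in for the object's colData, and the `area` column is invented. On the real object, the per-column form `spe$sample_id <- factor(spe$sample_id)` shown in the thread is the confirmed fix.

```r
# Toy stand-in for colData(spe); column names follow the thread, values are invented.
cd <- data.frame(
  sample_id = c("img2", "img1", "img2"),
  treatment = c("drug", "ctrl", "drug"),
  area = c(10.5, 12.1, 9.8),
  stringsAsFactors = FALSE
)

# Find the annotation columns still stored as plain character vectors.
is_chr <- vapply(cd, is.character, logical(1))

# Convert only those columns to factors; numeric columns are left untouched.
cd[is_chr] <- lapply(cd[is_chr], factor)
```

Note that `factor()` sorts levels alphabetically by default, so the order of entries in a legend may differ from the order in which the IDs first appear.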

SamGG commented 1 year ago

Great. I have already encountered problems where strings were not directly converted into factors. It might be related to R 4.x or to a library, but I can't remember...

Ind2022 commented 1 year ago

Thank you for your help. I was able to generate color coding for the treatment in the heatmap. But it doesn't show the sample ID, i.e., there is no legend or row annotation for sample ID even though the code asks for it, as seen below.

plotExprHeatmap(spe, k = "cluster_id", features = rownames(spe), bars = FALSE, perc = FALSE, row_clust = TRUE, by = 'sample_id', row_anno = c('treatment', 'sample_id'))

Ind2022 commented 1 year ago

Thank you, I am actually able to get the sample ID on this heatmap now. Both treatment and sample ID now show up with their color codes in the legend.

SamGG commented 1 year ago

Found a post about the change in how strings are converted to factors: https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/

nilseling commented 1 year ago

Yeah, they changed the stringsAsFactors default in R 4.0. I usually avoid using factors unless needed; in the past they caused more harm than they were worth. For plotting functions I internally convert to factors if need be.
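A small base R illustration of the default change mentioned above. Both behaviours are requested explicitly so the snippet gives the same result on any R version. The `ids` vector is invented, and the thread does not confirm which import step produced the character IDs.

```r
ids <- c("s1", "s2", "s1")

# Behaviour before R 4.0.0 (requested explicitly here): strings become factors.
old_style <- data.frame(sample_id = ids, stringsAsFactors = TRUE)

# Default since R 4.0.0 (also requested explicitly): strings stay character.
new_style <- data.frame(sample_id = ids, stringsAsFactors = FALSE)

is.factor(old_style$sample_id)     # TRUE
is.character(new_style$sample_id)  # TRUE
```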

HelenaLC commented 1 year ago

Thanks all. Yeah, it's used here just to extract unique sample identifiers. But I fixed it, and @nilseling's example works fine now. Re @Ind2022: I'd need more details if the above fix didn't resolve it... I don't really understand what the issue is now, and cannot reproduce it from the information provided.

Ind2022 commented 1 year ago

Thank you @HelenaLC. I am able to modify the expression heatmap plot, and it now shows all row annotations. I will reach out to you all if I need any other help with heatmaps.

I really appreciate your and the others' prompt replies on this issue. Thank you so much.