Closed algebio closed 2 years ago
Hi @algebio,
have you tried plotting the scatter plots individually instead of using lapply
?
Could you try this:
plotScatter(sce, chs, assay = "exprs")
plotScatter(sce, chs, assay = "compexprs")
and see if they differ?
Cheers,
Nils
Hi Nils
Thank you for your help. Unfortunately, the results of both plotScatter are exactly the same. I'm doing something wrong but I can't see where.
Regards Juan
The spillover matrix looks fine to me. Have you also tried to generate the scatter plot without using plotScatter
? Just to make sure that it's not a visualisation issue.
Same.
I also checked the values of sce@assays@data@listData[["exprs"]] and sce@assays@data@listData[["compexprs"]]. They seem exactly the same values.
Hmm, then I don't know where the issue is. Maybe @HelenaLC needs to have a look.
Hi Nils, firstly, thank you for trying to help me. I really appreciate it.
Second. After a long time banging my head against my desk, I found a mismatch in the number of channels because Ir193 is in the single metal spillover matrix but not in the tonsil csv files. I thought that was the problem but then I re-read CATALYST vignette and saw that "columns present in the SM but not in the input data will be removed from it", so that shouldn't be the problem.
Then I saw that the function assignPrelim calls another function .get_ms_from_chs(chs) which is a helper function to get mass and metal from a channel name. This is OK when a channel name is something like Ir193 but not when the names in the channels are of the sort "CD45RA(Pr141Di)" , "CD68(Nd142Di)" (like the ones I'm using) which gave me masses of 45141, 68142, etc.
After changing the names in sce@rowRanges@elementMetadata@listData$channel_name, I managed to compensate my sce object from my tonsil dataset. The problem now is that I am insecure about how reliable my results are because I got my first results (which were not compensated) without any error message. How can I be sure that this second one is correct? Is there any sanity check to be 100% sure that my compensated matrix is correct and I can trust this compensated sce?
Thanks again.
Hi @algebio
I'm glad you figured this out! Getting masses like 45141
looks like a bug to me and should be addressed by the maintainers of CATALYST. Also just as a side note, you should call the channel_name
via rowData(sce)$channel_name
. The sanity check here would be to generate the scatter plots of markers that are affected by spillover (e.g. In113 and In115) and also check that other markers are not being massively corrected (e.g. Eu153 and Sm154). In general, with correct panel design, you will not notice a huge effect of the spillover correction. So just watch out for any obvious issues in the downstream analysis.
"...you will not notice a huge effect of the spillover correction." Well, after checking my spillover matrix after compensation, all red flags are up. I got values as high as 80! Back to square one.
I'm not exactly sure what you mean with "spillover matrix after compensation". The spillover matrix that you generated (beforeCompensation) looks quite similar to the one of the original publication (displayed here). You then use the generated spillover matrix to compensate your actual dataset. The code that you posted originally looks alright to me. So you will need to compare the actual counts of your dataset (not the spillover matrix) before and after compensation.
Hi Nils. Thank you for your latest message, your help has been fundamental for me to finally get this done. I will acknowledge you if we get any publication using compensated imc data.
My last question is just a sanity check to confirm that I did the right thing to compare counts (maybe there is a better way as it happened with channel_name). I used write.csv(assay(sce, "exprs"), "countsBefCompensation.csv") and write.csv(assay(sce, "compexprs"), "countsAfterCompensation.csv"), I got the attached results. If you confirm that everything is OK, I will close this question.
Big Thanks Juan.
Hi Juan, no need to acknowledge me - just wanting to make sure that everything works :) I'd recommend plotting the expression values next to each other as you have done in the original post. To me the counts before compensation don't seem to be asinh transformed. Values like 27 for CD45RA would be far too high for transformed counts.
Spot on, Nils. I had not transformed the data. This is what I get now. Do these plots seem legit to you? They seem too "messy" to me.
Regards Juan
To me they look alright. The Yb173 --> Yb174 spillover is nicely corrected. Also check out Gd157 --> Gd158 and the Indiums.
Glad to hear. I don't have previous experience, so I was expecting something similar to the CD14 / CD16 plot from the CATALYST vignette.
These are the other plots that you mentioned
That looks alright. Maybe still check that there are numerical differences in In113 and In115 before and after compensation. But otherwise I guess this issue is now resolved, right?
It is indeed. Thank you so much for all the help and advice, Nils!
Hi Helena & team
I'm trying to compensate IMC data. I have a set of single colour text files and a set of imc tonsil samples that I would like to compensate. I have used readSCEfromTXT() and some code from imcRtools to create a SCE from the single cell colours and prepData to create a SCE from the tonsil samples.
I have plotted a spillover matrix which shows spillover signal from CD206(Gd157Di) to CD79a(Gd158Di). I have compensated it using nnls method and plotted a "before and after" compensation scatter plot of those two channels showing no differences.
I think I'm missing something but I can't find what it is. Maybe you can spot the mistake in no time. Find below a copy of the code.
Thank you for CATALYST and for your help!
get single-stained control samples
Matrix products: default
locale: [1] LC_COLLATE=English_United Kingdom.1252 [2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 [4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages: [1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages: [1] ggplot2_3.3.5 cowplot_1.1.1
[3] pheatmap_1.0.12 CATALYST_1.17.3
[5] flowCore_2.5.0 readxl_1.3.1
[7] imcRtools_0.99.9 SpatialExperiment_1.3.4
[9] SingleCellExperiment_1.15.2 SummarizedExperiment_1.23.5 [11] Biobase_2.53.0 GenomicRanges_1.45.0
[13] GenomeInfoDb_1.29.8 IRanges_2.27.2
[15] S4Vectors_0.31.5 BiocGenerics_0.39.2
[17] MatrixGenerics_1.5.4 matrixStats_0.61.0
loaded via a namespace (and not attached): [1] scattermore_0.7 flowWorkspace_4.5.3
[3] R.methodsS3_1.8.1 tidyr_1.1.4
[5] bit64_4.0.5 irlba_2.3.3
[7] multcomp_1.4-17 DelayedArray_0.19.4
[9] R.utils_2.11.0 data.table_1.14.2
[11] RCurl_1.98-1.5 doParallel_1.0.16
[13] generics_0.1.1 ScaledMatrix_1.1.0
[15] terra_1.4-11 TH.data_1.1-0
[17] proxy_0.4-26 ggpointdensity_0.1.0
[19] bit_4.0.4 tzdb_0.1.2
[21] xml2_1.3.2 httpuv_1.6.3
[23] assertthat_0.2.1 viridis_0.6.2
[25] hms_1.1.1 promises_1.2.0.1
[27] fansi_0.5.0 Rgraphviz_2.37.2
[29] igraph_1.2.7 DBI_1.1.1
[31] htmlwidgets_1.5.4 purrr_0.3.4
[33] ellipsis_0.3.2 dplyr_1.0.7
[35] ggcyto_1.21.0 ggnewscale_0.4.5
[37] ggpubr_0.4.0 backports_1.2.1
[39] cytolib_2.5.3 svgPanZoom_0.3.4
[41] RcppParallel_5.1.4 sparseMatrixStats_1.5.3
[43] vctrs_0.3.8 abind_1.4-5
[45] withr_2.4.2 ggforce_0.3.3
[47] aws.signature_0.6.0 cytomapper_1.5.5
[49] vroom_1.5.5 svglite_2.0.0
[51] cluster_2.1.2 crayon_1.4.1
[53] drc_3.0-1 labeling_0.4.2
[55] edgeR_3.35.3 pkgconfig_2.0.3
[57] units_0.7-2 tweenr_1.0.2
[59] vipor_0.4.5 rlang_0.4.11
[61] lifecycle_1.0.1 sandwich_3.0-1
[63] rsvd_1.0.5 cellranger_1.1.0
[65] polyclip_1.10-0 graph_1.71.2
[67] tiff_0.1-8 Matrix_1.3-4
[69] raster_3.5-2 carData_3.0-4
[71] Rhdf5lib_1.15.2 zoo_1.8-9
[73] base64enc_0.1-3 beeswarm_0.4.0
[75] RTriangle_1.6-0.10 ggridges_0.5.3
[77] GlobalOptions_0.1.2 png_0.1-7
[79] viridisLite_0.4.0 rjson_0.2.20
[81] bitops_1.0-7 shinydashboard_0.7.2
[83] R.oo_1.24.0 ConsensusClusterPlus_1.57.0 [85] KernSmooth_2.23-20 rhdf5filters_1.5.0
[87] DelayedMatrixStats_1.15.4 shape_1.4.6
[89] classInt_0.4-3 stringr_1.4.0
[91] readr_2.0.2 jpeg_0.1-9
[93] rstatix_0.7.0 ggsignif_0.6.3
[95] aws.s3_0.3.21 beachmat_2.9.1
[97] scales_1.1.1 magrittr_2.0.1
[99] plyr_1.8.6 hexbin_1.28.2
[101] zlibbioc_1.39.0 compiler_4.1.1
[103] dqrng_0.3.0 concaveman_1.1.0
[105] RColorBrewer_1.1-2 plotrix_3.8-2
[107] clue_0.3-60 cli_3.0.1
[109] XVector_0.33.0 ncdfFlow_2.39.1
[111] FlowSOM_2.1.24 MASS_7.3-54
[113] tidyselect_1.1.1 stringi_1.7.5
[115] forcats_0.5.1 RProtoBufLib_2.5.1
[117] yaml_2.2.1 BiocSingular_1.9.1
[119] locfit_1.5-9.4 latticeExtra_0.6-29
[121] ggrepel_0.9.1 grid_4.1.1
[123] EBImage_4.35.2 tools_4.1.1
[125] parallel_4.1.1 rio_0.5.27
[127] CytoML_2.5.4 circlize_0.4.13
[129] rstudioapi_0.13 foreach_1.5.1
[131] foreign_0.8-81 gridExtra_2.3
[133] farver_2.1.0 Rtsne_0.15
[135] ggraph_2.0.5 DropletUtils_1.13.4
[137] digest_0.6.28 shiny_1.7.1
[139] Rcpp_1.0.7 car_3.0-11
[141] broom_0.7.9 scuttle_1.3.1
[143] later_1.3.0 httr_1.4.2
[145] sf_1.0-3 ComplexHeatmap_2.9.4
[147] colorspace_2.0-2 XML_3.99-0.8
[149] splines_4.1.1 RBGL_1.69.0
[151] scater_1.21.8 graphlayouts_0.7.1
[153] sp_1.4-5 systemfonts_1.0.3
[155] xtable_1.8-4 jsonlite_1.7.2
[157] tidygraph_1.2.0 R6_2.5.1
[159] pillar_1.6.4 htmltools_0.5.2
[161] mime_0.12 nnls_1.4
[163] glue_1.4.2 fastmap_1.1.0
[165] DT_0.19 BiocParallel_1.27.17
[167] BiocNeighbors_1.11.0 fftwtools_0.9-11
[169] class_7.3-19 codetools_0.2-18
[171] mvtnorm_1.1-3 utf8_1.2.2
[173] lattice_0.20-45 tibble_3.1.5
[175] curl_4.3.2 ggbeeswarm_0.6.0
[177] colorRamps_2.3 gtools_3.9.2
[179] magick_2.7.3 zip_2.2.0
[181] openxlsx_4.2.4 survival_3.2-13
[183] limma_3.49.5 munsell_0.5.0
[185] e1071_1.7-9 GetoptLong_1.0.5
[187] rhdf5_2.37.4 GenomeInfoDbData_1.2.7
[189] iterators_1.0.13 HDF5Array_1.21.0
[191] reshape2_1.4.4 haven_2.4.3
[193] gtable_0.3.0
Rplot3.pdf comp CD79a CD206.pdf