RGLab / openCyto

A package that provides data analysis pipeline for flow cytometry.
GNU Affero General Public License v3.0

gt_gating cannot detect channels of interest #219

Closed wjs20 closed 3 years ago

wjs20 commented 3 years ago

The problem: I want to gate on the APC-Cy7-A and Brilliant Violet 785-A dimensions in my FCS files, but the gt_gating function does not seem to be able to detect them. They appear to be present in the flowset, as I get this readout when I call read.ncdfFlowSet:

All FCS files have the same following channels:
FSC-A
FSC-H
FSC-W
SSC-A
SSC-H
SSC-W
FITC-A
PerCP-Cy5-5-A
Cascade Blue-A
AmCyan-A
Brilliant Violet 605-A
Brilliant Violet 650-A
Brilliant Violet 705-A
Brilliant Violet 785-A
APC-A
Alexa Fluor 700-A
APC-Cy7-A
PE-A
PE-Texas Red-A
PE-Cy5-A
PE-Cy5-5-A
PE-Cy7-A
Time

When I call gt_gating I get this message:

Gating for ' APC-Cy7-A+'
Error in getChannelMarker(frm, channel) : can't find APC-Cy7-A

What I have tried: I have tried changing the hyphens to dots, and I have tried using the marker names instead of the fluorochrome names (i.e. "Dump" instead of "APC-Cy7-A"), but neither solves the problem.

I have tried to include a reproducible example below.

The FCS files can be downloaded from https://flowrepository.org/experiments/1515/download_ziped_files

load packages -----------------------------------------------------------

library(tidyverse)
library(openCyto)
library(ncdfFlow)
library(flowAI)
library(flowWorkspace)
library(flowWorkspaceData)
library(ggcyto)

load data --------------------------------------------------------------

fcs_files <- list.files( "data/b_cells/", pattern = "*.fcs", full.names = TRUE )

fs <- read.ncdfFlowSet(fcs_files)

gating ------------------------------------------------------------------

create gating template csv

gt_tbl <- structure(
  list(
    alias = c("nonDebris", "singlets", "lymph", "*"),
    pop = c("+", "+", "+", "+/-+/-"),
    parent = c("root", "nonDebris", "singlets", "lymph"),
    dims = c("FSC-A", "FSC-A,FSC-H", "FSC-A,SSC-A", "Brilliant Violet 785-A, APC-Cy7-A"),
    gating_method = c("gate_mindensity", "singletGate", "flowClust", "mindensity"),
    gating_args = c(NA, NA, "K = 2, target=c(1e5,5e4)", NA),
    collapseDataForGating = c(NA, NA, NA, NA),
    groupBy = c(NA, NA, NA, NA),
    preprocessing_method = c(NA, NA, "prior_flowClust", NA),
    preprocessing_args = c(NA, NA, "K = 2", NA)
  ),
  class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"),
  row.names = c(NA, -4L),
  spec = structure(
    list(
      cols = list(
        alias = structure(list(), class = c("collector_character", "collector")),
        pop = structure(list(), class = c("collector_character", "collector")),
        parent = structure(list(), class = c("collector_character", "collector")),
        dims = structure(list(), class = c("collector_character", "collector")),
        gating_method = structure(list(), class = c("collector_character", "collector")),
        gating_args = structure(list(), class = c("collector_character", "collector")),
        collapseDataForGating = structure(list(), class = c("collector_logical", "collector")),
        groupBy = structure(list(), class = c("collector_logical", "collector")),
        preprocessing_method = structure(list(), class = c("collector_character", "collector")),
        preprocessing_args = structure(list(), class = c("collector_character", "collector"))
      ),
      default = structure(list(), class = c("collector_guess", "collector")),
      skip = 1
    ),
    class = "col_spec"
  )
)

write_csv(gt_tbl, "bcell_gating_template.csv")

create gating template

gt <- gatingTemplate("bcell_gating_template.csv")

create empty gating set

gs <- GatingSet(fs)

apply gating strategy to gating set

gt_gating(gt, gs)

--ERROR--

Gating for ' APC-Cy7-A+'
Error in getChannelMarker(frm, channel) : can't find APC-Cy7-A

sessionInfo()

> R version 4.0.2 (2020-06-22)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 18362)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United Kingdom.1252
> [2] LC_CTYPE=English_United Kingdom.1252
> [3] LC_MONETARY=English_United Kingdom.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] ggcyto_1.17.0 flowWorkspaceData_3.1.0
> [3] flowWorkspace_4.1.9 flowAI_1.19.4
> [5] ncdfFlow_2.35.1 BH_1.72.0-3
> [7] RcppArmadillo_0.9.900.3.0 flowCore_2.1.2
> [9] openCyto_2.1.2 forcats_0.5.0
> [11] stringr_1.4.0 dplyr_1.0.2
> [13] purrr_0.3.4 readr_1.3.1
> [15] tidyr_1.1.2 tibble_3.0.3
> [17] ggplot2_3.3.2 tidyverse_1.3.0
>
> loaded via a namespace (and not attached):
> [1] colorspace_1.4-1 ellipsis_0.3.1 mclust_5.4.6
> [4] cytolib_2.1.18 corpcor_1.6.9 base64enc_0.1-3
> [7] fs_1.5.0 clue_0.3-57 hexbin_1.28.1
> [10] IDPmisc_1.1.20 fansi_0.4.1 mvtnorm_1.1-1
> [13] lubridate_1.7.9 xml2_1.3.2 splines_4.0.2
> [16] R.methodsS3_1.8.1 mnormt_2.0.2 robustbase_0.93-6
> [19] knitr_1.30 jsonlite_1.7.1 broom_0.7.0
> [22] cluster_2.1.0 dbplyr_1.4.4 png_0.1-7
> [25] R.oo_1.24.0 graph_1.67.1 rrcov_1.5-5
> [28] compiler_4.0.2 httr_1.4.2 backports_1.1.10
> [31] assertthat_0.2.1 Matrix_1.2-18 cli_2.0.2
> [34] htmltools_0.5.0 tools_4.0.2 gtable_0.3.0
> [37] glue_1.4.2 reshape2_1.4.4 Rcpp_1.0.5
> [40] Biobase_2.49.1 cellranger_1.1.0 vctrs_0.3.4
> [43] changepoint_2.2.2 xfun_0.17 rvest_0.3.6
> [46] lifecycle_0.2.0 gtools_3.8.2 XML_3.99-0.5
> [49] DEoptimR_1.0-8 zoo_1.8-8 zlibbioc_1.35.0
> [52] MASS_7.3-51.6 scales_1.1.1 RProtoBufLib_2.1.0
> [55] hms_0.5.3 parallel_4.0.2 RBGL_1.65.0
> [58] RColorBrewer_1.1-2 yaml_2.2.1 curl_4.3
> [61] gridExtra_2.3 aws.signature_0.6.0 latticeExtra_0.6-29
> [64] stringi_1.5.3 highr_0.8 S4Vectors_0.27.13
> [67] pcaPP_1.9-73 flowClust_3.27.0 BiocGenerics_0.35.4
> [70] flowViz_1.53.0 rlang_0.4.7 pkgconfig_2.0.3
> [73] matrixStats_0.56.0 evaluate_0.14 fda_5.1.5.1
> [76] lattice_0.20-41 ks_1.11.7 tidyselect_1.1.0
> [79] plyr_1.8.6 magrittr_1.5 R6_2.4.1
> [82] generics_0.0.2 DBI_1.1.0 pillar_1.4.6
> [85] haven_2.3.1 withr_2.3.0 modelr_0.1.8
> [88] crayon_1.3.4 KernSmooth_2.23-17 ellipse_0.4.2
> [91] tmvnsim_1.0-2 rmarkdown_2.4 aws.s3_0.3.21
> [94] jpeg_0.1-8.1 grid_4.0.2 readxl_1.3.1
> [97] data.table_1.13.0 blob_1.2.1 Rgraphviz_2.33.0
> [100] reprex_0.3.0 digest_0.6.25 R.utils_2.10.1
> [103] flowStats_4.1.0 RcppParallel_5.0.2 stats4_4.0.2
> [106] munsell_0.5.0

jacobpwagner commented 3 years ago

I think this is just a matter of an extra space here:

"Brilliant Violet 785-A, APC-Cy7-A"

You want

"Brilliant Violet 785-A,APC-Cy7-A"

When openCyto is parsing your csv template, it's just splitting on the comma. So the resulting channel on the right is " APC-Cy7-A" instead of "APC-Cy7-A" (you can actually see this extra space in your error). After removing that extra space, gt_gating completes successfully in my testing.
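The effect described above can be reproduced in base R. This is only a sketch and not openCyto's internal parsing code: it assumes nothing more than a plain split on the comma, and the `channels` vector is a hand-copied subset of the channel list from the original report.

```r
# Illustration only: base R, not openCyto's own parser.
dims <- "Brilliant Violet 785-A, APC-Cy7-A"

# Splitting on the comma alone keeps the space that followed it.
parts <- strsplit(dims, ",", fixed = TRUE)[[1]]
print(parts)

# Hand-copied subset of the channels listed in the original report.
channels <- c("Brilliant Violet 785-A", "APC-A", "APC-Cy7-A")

# Exact matching fails for the second piece until the whitespace is trimmed.
matches_raw <- parts %in% channels
matches_trimmed <- trimws(parts) %in% channels
print(matches_raw)
print(matches_trimmed)
```

Printing `parts` with quotes makes the leading space visible, which is often the quickest way to spot this kind of mismatch.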

gfinak commented 3 years ago

Shall we trim whitespace on the ends after splitting the arguments?

jacobpwagner commented 3 years ago

Yeah, I was thinking that may be a good idea, as I can't really imagine a case when leading/trailing spaces would be intentional. I can make that quick change.

jacobpwagner commented 3 years ago

Done in https://github.com/RGLab/openCyto/commit/aa2cf70b86997642e110928a23caaf36967d0a9b. Now it completes successfully even with the leading spaces.
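For anyone on a version that predates this commit, a similar clean-up can be applied to the template before it is written to disk. The sketch below is not the code from the commit: `normalize_dims` is a hypothetical helper, written under the assumption that `dims` entries are comma-separated channel names with no NA values.

```r
# Hypothetical helper (not part of openCyto): trim whitespace around each
# comma-separated channel name. Assumes `dims` contains no NA entries.
normalize_dims <- function(dims) {
  vapply(
    strsplit(dims, ",", fixed = TRUE),
    function(parts) paste(trimws(parts), collapse = ","),
    character(1)
  )
}

# The dims column from the template in the original report.
dims <- c("FSC-A", "FSC-A,FSC-H", "FSC-A,SSC-A", "Brilliant Violet 785-A, APC-Cy7-A")

cleaned <- normalize_dims(dims)
print(cleaned)
```

Spaces inside a channel name, such as in "Brilliant Violet 785-A", are left alone because `trimws` only removes leading and trailing whitespace. In the example from the report, this would be applied as `gt_tbl$dims <- normalize_dims(gt_tbl$dims)` before the `write_csv` call.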

wjs20 commented 3 years ago

Yes that has solved my issue. Thanks for your help.