RGLab / flowWorkspace

Export gated cell populations from FlowJo workspace to a dataframe #381

Open rohitfarmer opened 2 years ago

rohitfarmer commented 2 years ago

Hi there, I am working to export gated cell populations from a FlowJo workspace to an R data frame. My code fetches cells for the first few gates, and then I get a blank matrix; there are no errors. Any suggestions would be helpful. Thanks!

Below is the code for which I can fetch cells.

library(CytoML)
library(flowWorkspace)

# Load FlowJo workspace (xml) file
wsFile <- file.path("covidflu", "WSP-without-id", "20210727_COVID_FLU(act T cells).wsp")
ws <- CytoML::open_flowjo_xml(wsFile)

# Parse 
gs <- flowjo_to_gatingset(ws, name = 2, path = file.path("covidflu", "all-fcs-files"), execute = TRUE)

getdat <- gs_pop_get_data(gs, y = "/PBMC/Single Cells/Live/CD45+")
ff <- flowWorkspace::cytoframe_to_flowFrame(getdat[[1,]])
nrow(flowCore::exprs(ff))

[1] 883444

And below is the same code with the next gate beyond which I am getting nothing.

getdat <- gs_pop_get_data(gs, y = "/PBMC/Single Cells/Live/CD45+/Lymphocytes")
ff <- flowWorkspace::cytoframe_to_flowFrame(getdat[[1,]])
nrow(flowCore::exprs(ff))

[1] 0

Here is my sessionInfo()

> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.6

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] flowWorkspace_4.6.0 CytoML_2.6.0       

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0     lattice_0.20-45      colorspace_2.0-3     vctrs_0.4.2          generics_0.1.3       stats4_4.1.3        
 [7] yaml_2.3.5           ncdfFlow_2.40.0      base64enc_0.1-3      utf8_1.2.2           flowCore_2.6.0       RBGL_1.70.0         
[13] XML_3.99-0.11        rlang_1.0.6          hexbin_1.28.2        pillar_1.8.1         glue_1.6.2           DBI_1.1.3           
[19] aws.s3_0.3.21        Rgraphviz_2.38.0     BiocGenerics_0.40.0  RColorBrewer_1.1-3   readxl_1.4.1         plyr_1.8.7          
[25] matrixStats_0.62.0   jpeg_0.1-9           lifecycle_1.0.3      MatrixGenerics_1.6.0 zlibbioc_1.40.0      RProtoBufLib_2.6.0  
[31] cellranger_1.1.0     munsell_0.5.0        gtable_0.3.1         cytolib_2.6.2        latticeExtra_0.6-30  Biobase_2.54.0      
[37] IRanges_2.28.0       curl_4.3.3           fansi_1.0.3          Rcpp_1.0.9           scales_1.2.1         DelayedArray_0.20.0 
[43] S4Vectors_0.32.4     jsonlite_1.8.2       RcppParallel_5.1.5   graph_1.72.0         deldir_1.0-6         interp_1.1-3        
[49] gridExtra_2.3        ggplot2_3.3.6        png_0.1-7            digest_0.6.29        dplyr_1.0.10         grid_4.1.3          
[55] cli_3.4.1            tools_4.1.3          magrittr_2.0.3       tibble_3.1.8         aws.signature_0.6.0  pkgconfig_2.0.3     
[61] Matrix_1.5-1         data.table_1.14.2    xml2_1.3.3           assertthat_0.2.1     httr_1.4.4           rstudioapi_0.14     
[67] R6_2.5.1             ggcyto_1.22.0        compiler_4.1.3 

miosisoniii commented 1 year ago

@rohitfarmer are you looking for population statistics? You can try gs_pop_get_count_fast, which produces a matrix, which could then just be piped into a dataframe:

# Load FlowJo workspace (xml) file
wsFile <- file.path("covidflu", "WSP-without-id",
                    "20210727_COVID_FLU(act T cells).wsp")
ws <- CytoML::open_flowjo_xml(wsFile)

# Parse 
gs <- flowjo_to_gatingset(ws,
                          name = 2,
                          path = file.path("covidflu", "all-fcs-files"),
                          execute = TRUE)

# Using gs_pop_get_count_fast
getdat <- gs_pop_get_count_fast(gs,
                                statistic = "freq",
                                format = "wide",
                                xml = TRUE)

# coerce to dataframe since the function produces a matrix
getdat_df <- getdat |>
  as.data.frame()

rohitfarmer commented 1 year ago

@miosisoniii no, I am interested in exporting individual cells with their marker values and time stamp. I can export them now; however, during the export the values are transformed in a way that I cannot reverse. As a result, the values do not match those of the same population exported from FlowJo.

mikejiang commented 1 year ago

First of all, to fetch the expression data matrix you do not need to convert to a flowFrame; exprs(getdat[[1]]) should do.

Secondly, a zero count for /PBMC/Single Cells/Live/CD45+/Lymphocytes simply means there are no cells in that gate.
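
One way to check whether an empty gate is also empty in FlowJo itself is to put the counts stored in the workspace next to the counts recomputed during parsing. The following is a minimal sketch, assuming the flowWorkspaceData example GatingSet is installed; with your own data you would use the gs returned by flowjo_to_gatingset instead. The column names mentioned in the comments are what I would expect from gh_pop_compare_stats, so treat them as an assumption and inspect the printed table.

```r
library(flowWorkspace)

# Example GatingSet from flowWorkspaceData (stand-in for your own gs)
dataDir <- system.file("extdata", package = "flowWorkspaceData")
gs_dir <- list.files(dataDir, pattern = "gs_manual", full.names = TRUE)
gs <- load_gs(gs_dir)

# For the first sample, compare the stats stored in the FlowJo workspace
# with the stats recomputed by the gating engine. The result is expected
# to hold one row per node, with columns such as xml.count (FlowJo) and
# openCyto.count (recomputed); these names are an assumption.
stats <- gh_pop_compare_stats(gs[[1]])
print(stats)
```

If the FlowJo count for a node is non-zero while the recomputed count is zero, the gate was probably not reproduced correctly during parsing, which is worth reporting together with the workspace and FCS file.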

Finally, the expression data matrix is stored on the transformed scale once the FlowJo workspace has been parsed into a GatingSet. To get the raw scale, set the inverse.transform flag, for example:

> dataDir <- system.file("extdata", package = "flowWorkspaceData")
> gs_dir <- list.files(dataDir, pattern = "gs_manual", full.names = TRUE)
> gs <- load_gs(gs_dir)
> head(flowCore::exprs(gs_pop_get_data(gs, "CD4")[[1]])[,5:7])
     <B710-A> <R660-A>  <R780-A>
[1,] 3106.004 3302.719 2073.3540
[2,] 3128.845 1834.073 1607.8027
[3,] 2902.931 2458.440  482.8756
[4,] 2928.725 1382.240 1510.5111
[5,] 2832.599 1277.941  714.8516
[6,] 2727.793 1704.678  678.8153
> head(flowCore::exprs(gs_pop_get_data(gs, "CD4", inverse.transform = TRUE)[[1]])[,5:7])
      <B710-A>   <R660-A>  <R780-A>
[1,] 21521.121 35399.8320 2342.2910
[2,] 22785.982  1105.2427  958.1031
[3,] 12979.613  4470.3438 -667.2806
[4,] 13837.313   431.4819  779.5670
[5,] 10906.154   341.7943 -326.7503
[6,]  8425.668   843.9904 -376.3401
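
Putting the pieces together for the original goal (one data frame of individual cells, on the raw scale), a minimal sketch could look like the following. It assumes the flowWorkspaceData example GatingSet and its "CD4" node as stand-ins; with your own data, substitute your gs and a gate path such as "/PBMC/Single Cells/Live/CD45+". The sample column is added here for illustration only.

```r
library(flowWorkspace)

# Example GatingSet from flowWorkspaceData (stand-in for your own gs)
dataDir <- system.file("extdata", package = "flowWorkspaceData")
gs_dir <- list.files(dataDir, pattern = "gs_manual", full.names = TRUE)
gs <- load_gs(gs_dir)

# Events inside the gate, converted back to the raw (untransformed) scale
cs <- gs_pop_get_data(gs, "CD4", inverse.transform = TRUE)

# Build one data frame per sample, tag it with the sample name, and stack.
# check.names = FALSE keeps channel names such as <B710-A> unchanged.
cells <- do.call(rbind, lapply(sampleNames(cs), function(sn) {
  mat <- flowCore::exprs(cs[[sn]])
  data.frame(sample = sn, mat, check.names = FALSE)
}))

head(cells)
```

Each row is one cell and each column one channel, so a time channel, if the FCS files contain one, comes along as an ordinary column.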

rohitfarmer commented 1 year ago

Which Bioconductor version is the code from? It's not working for me. I had a similar problem before when I pointed out that the cell count from the gate was zero. I had to lower the Bioconductor version to make it work.

mikejiang commented 1 year ago

The latest release. Please provide your sessionInfo() and a reproducible example so that we can help you.