RGLab / CytoML

A GatingML Interface for Cross Platform Cytometry Data Sharing
GNU Affero General Public License v3.0

Loss of ability to use gates in imported workspace without transforming the data #110

Closed. PedroMilanezAlmeida closed this issue 4 years ago

PedroMilanezAlmeida commented 4 years ago

A few days ago I updated to R 4.0.2 and Bioc 3.11 and lost the ability to parse a FlowJo workspace with flowjo_to_gatingset using transform = FALSE while still applying the gates.

The feature described above was particularly useful when dealing with data that were transformed outside FlowJo but imported into FlowJo for gating. For us, this functionality is very important for analysis of CITEseq data.

I wonder whether it would be possible to re-enable the use of gates without data transformation (i.e., get "raw gates").

Here are the gates in FlowJo and in R:

[Images: the gate as plotted in R (Rplot) and in FlowJo (FJ_plot)]

In case helpful, for the gate above, the info in the wsp/xml file is:

[Three screenshots of the gate definition in the wsp/xml file]

This is the info from R:

[Screenshot of the gate information in R]

PedroMilanezAlmeida commented 4 years ago

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggcyto_1.16.0             ncdfFlow_2.34.0           BH_1.72.0-3              
 [4] RcppArmadillo_0.9.900.2.0 ggplot2_3.3.2             flowCore_2.0.1           
 [7] flowUtils_1.52.0          CytoML_2.0.5              flowWorkspace_4.0.6      
[10] FlowSOM_1.20.0            igraph_1.2.5              XML_3.99-0.5             
[13] magrittr_1.5              glue_1.4.1                Rcpp_1.0.5               
[16] gridExtra_2.3             pheatmap_1.0.12           foreach_1.5.0            
[19] BiocManager_1.30.10      

loaded via a namespace (and not attached):
 [1] lattice_0.20-41             corpcor_1.6.9               RProtoBufLib_2.0.0         
 [4] png_0.1-7                   digest_0.6.25               R6_2.4.1                   
 [7] plyr_1.8.6                  stats4_4.0.2                pillar_1.4.6               
[10] zlibbioc_1.34.0             rlang_0.4.7                 rstudioapi_0.11            
[13] data.table_1.13.0           Rgraphviz_2.32.0            hexbin_1.28.1              
[16] RUnit_0.4.32                labeling_0.3                stringr_1.4.0              
[19] munsell_0.5.0               compiler_4.0.2              pkgconfig_2.0.3            
[22] ConsensusClusterPlus_1.52.0 BiocGenerics_0.34.0         base64enc_0.1-3            
[25] tidyselect_1.1.0            tibble_3.0.3                codetools_0.2-16           
[28] matrixStats_0.56.0          crayon_1.3.4                dplyr_1.0.1                
[31] withr_2.2.0                 grid_4.0.2                  RBGL_1.64.0                
[34] tsne_0.1-3                  jsonlite_1.7.0              gtable_0.3.0               
[37] lifecycle_0.2.0             scales_1.1.1                graph_1.66.0               
[40] RcppParallel_5.0.2          stringi_1.4.6               farver_2.0.3               
[43] latticeExtra_0.6-29         xml2_1.3.2                  ellipsis_0.3.1             
[46] generics_0.0.2              vctrs_0.3.2                 RColorBrewer_1.1-2         
[49] iterators_1.0.12            tools_4.0.2                 Biobase_2.48.0             
[52] purrr_0.3.4                 jpeg_0.1-8.1                parallel_4.0.2             
[55] yaml_2.2.1                  colorspace_1.4-1            cluster_2.1.0              
[58] cytolib_2.0.3

PedroMilanezAlmeida commented 4 years ago

What is weird is that, on a different dataset (simulated in R, imported into FJ but no FJ-data transformation applied [linear]), flowjo_to_gatingset does NOT distort the gate coordinates:

[Four images, including the R plot (Rplot), showing the gate on the simulated dataset]

And, in the first dataset (CITEseq), the first gate is NOT distorted either (CD11b-CD14-, not shown here).

PedroMilanezAlmeida commented 4 years ago

@gfinak Thanks for your reply!

mikejiang commented 4 years ago

transform = FALSE is meant only for parsing the raw gates (for troubleshooting purposes) without doing any gating or data loading. It is only valid when execute = FALSE, so it is not relevant to what you are trying to do. In fact, you should see a message when you try to disable transform while still attempting to proceed with parsing:

> gs <- flowjo_to_gatingset(ws, name = 4, transform = F, subset = 1)
'transform = false' is ignored when 'is_gating' is set to true!
> gs
A GatingSet with 1 samples

You shouldn't need to fiddle with the transform flag. The parser is supposed to import whatever is recorded in the FlowJo workspace, regardless of whether the data and gates are transformed or linear.

The right question is why your first gate is distorted, which may require a reproducible example to troubleshoot (feel free to zip it and send it to wjiang2@fredhutch.org).
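
For readers who do want the parse-only mode described above, here is a minimal sketch. It assumes the `execute` and `transform` arguments of `flowjo_to_gatingset` behave as stated in this comment. The helper name `parse_raw_gates`, its `wsp_path` argument, and the default group index are hypothetical and only serve the illustration.

```r
# Sketch only: read gates from a FlowJo workspace without loading FCS data.
# `parse_raw_gates` and `wsp_path` are hypothetical names, not part of CytoML.
parse_raw_gates <- function(wsp_path, group = 1) {
  ws <- CytoML::open_flowjo_xml(wsp_path)
  # execute = FALSE skips data loading and gating; per the comment above,
  # this is the only mode in which transform = FALSE takes effect.
  gs <- CytoML::flowjo_to_gatingset(ws, name = group,
                                    execute = FALSE, transform = FALSE)
  # Collect the gate of every non-root population of the first sample.
  nodes <- flowWorkspace::gs_get_pop_paths(gs)[-1]
  lapply(stats::setNames(nodes, nodes),
         function(node) flowWorkspace::gh_pop_get_gate(gs[[1]], node))
}
```

Because no events are loaded, the result is useful for inspecting gate coordinates but carries no population statistics.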

PedroMilanezAlmeida commented 4 years ago

Hey Mike, thanks for your comment.

I understand that transform was only meant to be used in troubleshooting. However, until now, often when I had non-standard/non-FCS data that I needed to analyze in FlowJo and reimport in R, turning off transform was the only way for the gates to be parsed correctly.

Thanks again for your help, I will send you a reproducible example via email later today.

mikejiang commented 4 years ago

Again, it has nothing to do with transformation, since all the channels are on a linear scale. It turns out to be the gate extension issue discussed in #106. Until we figure out a more robust way to automatically determine whether to trigger the gate extension (to the far lower end of the axis), you can manually turn it off (by lowering the threshold extend_val) to get your correct gates back.

gs <- flowjo_to_gatingset(ws,
                          name = 1,
                          subset = "CITEseq_downsampled.fcs",
                          sampNloc = "sampleNode",
                          extend_val = -Inf)
autoplot(gs[[1]],
         gs_get_pop_paths(gs[[1]])[4])

[Image: autoplot output for the gate]

> gh_pop_compare_stats(gs[[1]])
   openCyto.freq xml.freq openCyto.count xml.count        node
1:     1.0000000 1.000000           5000      5000        root
2:     1.0000000 1.000000           5000      5000   all cells
3:     0.8800000 0.882200           4400      4411 CD11b-CD14-
4:     0.1488636 0.148039            655       653   CD19-CD3-
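
Why lowering extend_val to -Inf switches the extension off can be shown with a toy model. This is a simplified sketch, not CytoML's actual implementation: it assumes that any gate coordinate below the threshold is pushed out to a far lower value, and that the defaults are 0 and -4000. The function `extend_gate` and the gate coordinates are made up for the illustration.

```r
# Toy model of gate extension; NOT CytoML's actual implementation.
# Assumption: any coordinate below `extend_val` is pushed out to `extend_to`.
extend_gate <- function(vertices, extend_val = 0, extend_to = -4000) {
  vertices[vertices < extend_val] <- extend_to
  vertices
}

# A rectangular gate on data that were transformed outside FlowJo, so
# legitimate gate boundaries lie below zero (made-up coordinates).
gate <- cbind(x = c(-1.5, 2.0, 2.0, -1.5),
              y = c(-0.5, -0.5, 3.0, 3.0))

extended  <- extend_gate(gate)                     # threshold 0: negative vertices are moved
unchanged <- extend_gate(gate, extend_val = -Inf)  # nothing is below -Inf: gate is kept as drawn
```

In this model, data with meaningful negative values trip the default threshold even though the gate was never meant to reach the axis edge, which is consistent with the distortion reported in this thread.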

PedroMilanezAlmeida commented 4 years ago

Cool!!! Thank you so much! I'll give it a try and let you know how it goes! Thanks again!

EDIT: @mikejiang it solved my issue!