RGLab / flowWorkspace

GNU Affero General Public License v3.0

'gs_pop_get_data' returns inverse-transformed data #337

Closed by aspergatus 4 years ago

aspergatus commented 4 years ago

Maybe I am misunderstanding the tutorial description: https://www.bioconductor.org/packages/release/bioc/vignettes/flowWorkspace/inst/doc/flowWorkspace-Introduction.html

It says: "Because GatingSet is a purely reference class, the class type returned by getData is a cytoset, which is the purely reference class analog of a flowSet and will be discussed in more detail below. Also note that the data is already compensated and transformed during the parsing."

But the function definitely returns the inverse-transformed data by default! Could you please take a look?

jacobpwagner commented 4 years ago

How so?:

> library(flowWorkspace)
> library(CytoML)
> library(ggcyto)
> path <- system.file("extdata",package="flowWorkspaceData")
> wsfile <- list.files(path, pattern="A2004Analysis.xml", full = TRUE)
> ws <- open_flowjo_xml(wsfile)
> gs <- flowjo_to_gatingset(ws,name = 1); #import the first group
> # After biexponential transformations
> cs_transformed <- gs_pop_get_data(gs)
> range(cs_transformed[[1]])
     FSC-A  FSC-H  FSC-W  SSC-A  SSC-H  SSC-W <Am Cyan-A> Am Cyan-H <Pacific Blue-A> Pacific Blue-H   <APC-A>    APC-H <APC-CY7-A> APC-CY7-H <Alexa 700-A> Alexa 700-H    <FITC-A>   FITC-H
min   -111      0      0   -111      0      0   -1.436443   455.000        -1.436443        455.000  211.2436  455.000   -1.436443   455.000     -1.436443     455.000   -1.436443  455.000
max 262143 262143 262143 262143 262143 262143 4096.836914  4096.837      4096.836914       4096.837 4096.8369 4096.837 4096.836914  4096.837   4096.836914    4096.837 4096.836914 4096.837
    <PerCP-CY5-5-A> PerCP-CY5-5-H  <PE-CY7-A> PE-CY7-H    Time
min       -1.436443       455.000   -1.436443  455.000    0.00
max     4096.836914      4096.837 4096.836914 4096.837 2621.43
> ggcyto(cs_transformed[[1]], aes("APC-H")) +
+   geom_density()
> # Back on instrument scale
> cs_inverted <- gs_pop_get_data(gs, inverse.transform = TRUE)
> range(cs_inverted[[1]])
     FSC-A  FSC-H  FSC-W  SSC-A  SSC-H  SSC-W <Am Cyan-A> Am Cyan-H <Pacific Blue-A> Pacific Blue-H  <APC-A>    APC-H <APC-CY7-A> APC-CY7-H <Alexa 700-A> Alexa 700-H <FITC-A>   FITC-H
min   -111      0      0   -111      0      0      -111.0       0.0           -111.0            0.0    -54.6      0.0      -111.0       0.0        -111.0         0.0   -111.0      0.0
max 262143 262143 262143 262143 262143 262143    262143.1  262143.1         262143.1       262143.1 262143.1 262143.1    262143.1  262143.1      262143.1    262143.1 262143.1 262143.1
    <PerCP-CY5-5-A> PerCP-CY5-5-H <PE-CY7-A> PE-CY7-H    Time
min          -111.0           0.0     -111.0      0.0    0.00
max        262143.1      262143.1   262143.1 262143.1 2621.43
> ggcyto(cs_inverted[[1]], aes("APC-H")) +
+   geom_density()

There are no transformations applied to the scatter channels, but for the other channels you can clearly see that gs_pop_get_data returns data on the transformed 4096 scale by default, and only returns it on the 262144 instrument scale when we add inverse.transform = TRUE.
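To make the direction of the two scales concrete, here is a minimal base-R sketch of a forward transform and its inverse. It assumes an asinh transform with a made-up cofactor as a stand-in for the workspace's biexponential transform; the names to_display, to_instrument and cofactor are hypothetical and are not part of flowWorkspace.

```r
# Sketch only: asinh with a hypothetical cofactor stands in for the
# biexponential transform; this is NOT the function flowWorkspace applies.
cofactor <- 150  # hypothetical value chosen for illustration

to_display    <- function(x) asinh(x / cofactor)  # forward transform
to_instrument <- function(y) sinh(y) * cofactor   # inverse transform

raw         <- c(-111, 0, 455, 10000, 262143)  # values on the instrument scale
transformed <- to_display(raw)                 # compressed display scale
restored    <- to_instrument(transformed)      # back on the instrument scale

print(range(transformed))
print(range(restored))

# The inverse undoes the forward transform (up to floating-point tolerance).
stopifnot(isTRUE(all.equal(restored, raw)))
```

The same relationship holds in the output above: the default call corresponds to the compressed range, and inverse.transform = TRUE corresponds to the restored range.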

jacobpwagner commented 4 years ago

Sorry. Forgot the plots:

> ggcyto(cs_transformed[[1]], aes("APC-H")) +
+   geom_density()

[image: density plot of APC-H from cs_transformed]

> ggcyto(cs_inverted[[1]], aes("APC-H")) +
+   geom_density()

[image: density plot of APC-H from cs_inverted]

jacobpwagner commented 4 years ago

getData is deprecated and has been renamed to gs_pop_get_data; calls to getData are still redirected, with a warning. So even if you use getData above, you'll get the same results.
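For readers unfamiliar with how such a redirect typically behaves, here is a generic base-R sketch of a deprecation shim. It is an illustration under assumptions, not flowWorkspace's actual source: new_get_data and old_get_data are hypothetical names, and the only real API used is base R's .Deprecated().

```r
# Hypothetical names; a generic deprecation shim, not flowWorkspace's code.
new_get_data <- function(x) x * 2

old_get_data <- function(...) {
  .Deprecated("new_get_data")  # emits a deprecation warning
  new_get_data(...)            # then forwards the call unchanged
}

# Call the old name, record that a warning fired, and keep the result.
warned <- FALSE
old_result <- withCallingHandlers(
  old_get_data(21),
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)

# The old name warns but returns exactly what the new name returns.
stopifnot(warned, identical(old_result, new_get_data(21)))
```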

aspergatus commented 4 years ago

Hi Jacob,

Thank you for looking at it so quickly. I had understood that 4096 is the untransformed data (the original data written in the FCS files) and 262143 is the transformed data (after linearize-with-PnG-scaling). Am I wrong?