RGLab / flowWorkspace


Add method to efficiently change the transformation on a single channel #324

Open jacobpwagner opened 4 years ago

jacobpwagner commented 4 years ago

The title pretty much describes it. If a user wants to alter the transformation on a single channel, the options currently available are:

  1. Invert all transformations when pulling the cytoset by using the inverse.transform flag, change the single transformation, then re-apply the forward transformations to all channels.
  2. Manually pull out the single inverse transform and apply it by:
    • Using gh_get_transformations to get the transformations
    • Pulling the cytoset and inverse transforming it using a transformList containing the single inverse
    • Transforming the cytoset with the new transformation
    • Reaching into the internal function flowWorkspace:::set_transformations to set the transformations for the GatingSet so the gate coordinates are appropriately transformed
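
The data-level part of option 2 can be sketched on a single flowFrame using flowCore alone. This is only a minimal illustration under assumptions: the channel names, the logicle starting transformation, and the arcsinhTransform parameters are arbitrary choices made for the example. The GatingSet-specific steps (gh_get_transformations, updating the stored transformation and the gate coordinates) are not covered here.

```r
library(flowCore)

data(GvHD)
fr <- GvHD[[1]]

# Assumed starting point: two channels already on a logicle scale.
lgcl <- logicleTransform()
fr_t <- transform(fr, transformList(c("FL1-H", "FL2-H"), lgcl))

# Invert only the channel whose transformation should change.
inv_lgcl <- inverseLogicleTransform(trans = lgcl)
fr_raw <- transform(fr_t, transformList("FL1-H", inv_lgcl))

# Apply the replacement transformation to that channel alone.
# The arcsinh parameters are illustrative, not recommended values.
asinh_new <- arcsinhTransform(a = 0, b = 1 / 150, c = 0)
fr_new <- transform(fr_raw, transformList("FL1-H", asinh_new))
```

FL2-H is never inverted or re-transformed in this sketch, which is the saving option 2 offers over inverting every channel.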

Option 1 is a little easier for users to put together, but it's significantly less efficient because it requires inverting and re-transforming all channels. Option 2 is more efficient in that it only does the inversion and re-transformation of a single channel, but it is more onerous for users.

Neither is ideal, and this is likely to be a fairly common operation so we should make it easier and more efficient. This may also be another reason to consider storing the data at the raw scale, as then this just requires changing the transformation object for the channel of interest, which will be applied when it is needed.

k-motwani commented 2 years ago

I think I have a question in a similar vein to this issue, involving the use of GatingHierarchy (gh) transformations. My goal is to obtain transformed axis breaks from a GatingSet, which can then be passed to ggcyto. If there's a better way of accomplishing this task I'd appreciate any advice!

First question: Is there an easier way to get a single channel from all samples in a GatingSet? Included here is a reprex starting with a GatingSet, along with my initial attempt, which can probably be vectorized or replaced with a gs_* function (suggestions would be appreciated):

library(flowCore)       # GvHD data, flowSet
library(flowWorkspace)  # GatingSet and cytoset helpers
library(magrittr)       # %>%

data(GvHD)
fs <- GvHD[1:5]
gs <- fs %>% flowSet_to_cytoset() %>% GatingSet()

gs.markers <- gs %>% markernames()
# Convert every sample in the GatingSet to a flowFrame
gs.ff <- gs %>% gs_cyto_data() %>% cytoset_to_list() %>% lapply(cytoframe_to_flowFrame)
# Channels present in all samples
ff.cols <- gs.ff %>% lapply(colnames) %>% Reduce(intersect, .)
# One entry per marker channel, each a list of single-channel flowFrames
data.raw <- list()
for (ch in ff.cols[ff.cols %in% names(gs.markers)]) {
        data.raw[[ch]] <- gs.ff %>% lapply(function(ff) ff[, ch])
}
data.pars.fs <- data.raw %>% lapply(flowSet)

To complete this convoluted approach, the next step would be to obtain min/max axis breaks from this list of flowSets, each containing a single parameter. I considered doing this by splitting each flowSet back into flowFrames and then following the example below from the flow_breaks documentation:

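library(ggcyto)  # flow_breaks
fr <- GvHD[[1]]
# Raw values for a single channel (named fl1.raw to avoid clashing with the data.raw list above)
fl1.raw <- exprs(fr)[, "FL1-H"]
trans <- logicleTransform()
inv <- inverseLogicleTransform(trans = trans)
myBrks <- flow_breaks(fl1.raw, equal.space = TRUE, trans = trans, inv = inv)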

I'm guessing there's probably a better way to do most of this. Thanks in advance!