RGLab / openCyto

A package that provides data analysis pipeline for flow cytometry.
GNU Affero General Public License v3.0
75 stars 29 forks source link

MFI values from gates #211

Closed browshanravan closed 4 years ago

browshanravan commented 4 years ago

I am trying to get population statistics such as mean and/or median FI from my samples gated using openCyto.

For my transformed GatingSet (gs), I have tried gs_pop_get_stats(gs, nodes= c("gate1", "gate2"), type = pop.MFI, inverse.transform = TRUE) in order to get the inverse transformed median fluorescence intensity for each marker within gate1 and gate2. However all I get back is Null data.table (0 rows and 0 cols). For each gate I have executed the recompute(gs) command, when making them.

I could potentially change each gate back into a flowSet and calculate the MFIs that way but that would be very long-winded and painful. Any suggestions?

mikejiang commented 4 years ago

make sure "gate1" is the correct gating path in your gs, try to paste the output of gs_get_pop_paths(gs) to double check

browshanravan commented 4 years ago

Yes, I get the correct path. Here is my output [1] "root" "/NonDebris" "/NonDebris/singlets" "/NonDebris/singlets/gate1" "/NonDebris/singlets/gate2".

Even the plot(gs) works as I expect. I have imported these samples from flowCore using gs<- GatingSet(my_flowSet) although I'm not sure if that makes a difference.

browshanravan commented 4 years ago

And I have also pasted the paths output of gs_get_pop_paths(gs) into the gs_pop_get_stats() command above but I get the same thing.

mikejiang commented 4 years ago

Can you reproduce this example code ?

dataDir <- system.file("extdata",package="flowWorkspaceData")
suppressMessages(gs <- load_gs(list.files(dataDir, pattern = "gs_manual",full = TRUE)))

nodes <- c("CD4", "CD8")
# compute the stats based on the raw data scale
gs_pop_get_stats(gs, nodes, type = pop.MFI, inverse.transform = TRUE)
                   sample pop CD4 PcpCy55  CD38 APC  CD8 APCH7 CD3 V450 HLA-DR V500  CCR7 PE CD45RA PECy7
1: CytoTrol_CytoTrol_1.fcs CD4  18102.2500 1058.0019   143.6824 6508.279    571.9359 1950.782     765.5124
2: CytoTrol_CytoTrol_1.fcs CD8    243.7508  970.9595 94714.4328 5582.927    718.7893 1356.882   23314.0585
browshanravan commented 4 years ago

Yes, the code provided works for me in terms of giving me the same output.

mikejiang commented 4 years ago

What is the output of gh_pop_get_count(gs[[1]], "gate1")?

browshanravan commented 4 years ago

It gives me [1] 2600 which I suspect is the number of cells in my gate.

gfinak commented 4 years ago

Do you have a transformation associated with the GatingSet? Can you generate dotplots of the gated populations? Does omitting inverse.trans produce expected results?

browshanravan commented 4 years ago

Yes, I can easily generate dot plots of all my gates using autoplot(gs[[1]]) and all the gates and the associated percentages show up fine, with all the markers correctly transformed.

I transform my GatingSet in openCyto/flowWorkspace environment using the code tf<- estimateLogicle(gs[[1]], wanted_stains) followed by gs<- transform(gs, tf). I can then easily asses the individual flowFrames and check that they have been transformed using the command gs_pop_get_data(gs)[[1]].

Omitting the inverse.transform makes no difference and I get the same output, which is Null data.table (0 rows and 0 cols).

jacobpwagner commented 4 years ago

Can you give a reproducible example starting from a fresh R session (every line of code from initial library statements to the gh_pop_get_stats call)? I can substitute other data if that can't be shared, but if I follow all of your steps I may be able to figure out where this is getting tripped up.

mikejiang commented 4 years ago

how about the output of

fr <- gh_pop_get_data(gs[[1]])
pData(parameters(fr))

and

parameters(gh_pop_get_gate(gs[[1]], "gate1"))

pop.MFI only computes the median fluorescence intensity for each marker. If you don't have markers associated with channels, it might give you empty result

browshanravan commented 4 years ago

@mikejiang The first set of commands give me

 name desc  range    minRange maxRange
$P1          FSC-A <NA> 262144  0.00000000 262143.0
$P2          FSC-H <NA> 262144  0.00000000 262143.0
$P3          FSC-W <NA> 262144  0.00000000 262143.0
$P4          SSC-A <NA> 262144  0.00000000 262143.0
$P5          SSC-H <NA> 262144  0.00000000 262143.0
$P6          SSC-W <NA> 262144  0.00000000 262143.0
$P7          GFP-A <NA> 262144 -0.01400161      4.5
$P8 Pacific Blue-A <NA> 262144 -0.07888681      4.5
$P9           Time <NA> 262144  0.00000000 262143.0

and the second one gives me

 GFP-A   Pacific Blue-A 
         "GFP-A" "Pacific Blue-A" 

Which is what you would expect?

@jacobpwagner I will get it done and get back to you asap!

jacobpwagner commented 4 years ago

Actually, it may not be necessary. I think Mike is right. Note that your desc column is all <NA>. You don't have marker names associated with channels, which is why pop.MFI is giving an empty result.

mikejiang commented 4 years ago

So the current pop.MFI won't work for your data since it requires non-NA desc column from your data. You can supply your own version of it to meet your needs

my.pop.MFI <- function(fr){
  pd <- pData(parameters(fr))
  pd <- data.table(pd)
  chnls  <- pd[, name]
  res <- colMedians(exprs(fr)[, chnls, drop = FALSE])
  res
}
gs_pop_get_stats(gs, nodes= c("gate1", "gate2"), type = my.pop.MFI, inverse.transform = TRUE)
browshanravan commented 4 years ago

@mikejiang it works but the column headers for the MFIs are V1,V2 etc till V9 which I'm not sure what they are and why there is 9 of them.

mikejiang commented 4 years ago

add names(res) <- chnls before returning res in your my.pop.MFI function, you can remove the extra columns that you don't need after you get the data.table.

browshanravan commented 4 years ago

@mikejiang That's fantastic thank you so much! It works!

Thank you very much @mikejiang @jacobpwagner and @gfinak for giving me your time and resolving my issue. Much appreciated!