RGLab / flowStats

flowStats: algorithms for flow cytometry data analysis using BioConductor tools

error when using "spillover_match" #29

Open vhollek opened 4 years ago

vhollek commented 4 years ago

I'm new to R and openCyto/flowCore etc. To analyze my FACS data, I first want to compensate it using spillover_match, spillover and compensate. I have all the necessary single-stained FCS samples (including unstained) and created a CSV file containing filenames and channels. Unfortunately, I get the error

Error in value[[3L]] (cond): The flowSet spillover_match method has been moved to the flowStats package. Please library(flowStats) first.

even though I did library(flowStats) [Version 3.44.0].

My code:

fcs.dir <- system.file("extdata", "compdata", "data", "20200122", package="flowCore")
frames <- lapply(dir(fcs.dir, full.names=TRUE), read.FCS, emptyValue = FALSE)
fs <- as(frames, "flowSet")
fs

flowStats::spillover_match(x = fs, fsc = "FSC-A", ssc = "SSC-A", matchfile = "FACSexp/20200122/compensation.csv", path = "FACSexp/20200122/")

version  R version 3.6.2 (2019-12-12)
os       Windows 10 x64
system   x86_64, mingw32

Does anybody know what I can do/change to make it work?

jacobpwagner commented 4 years ago

Some aspects of your code don't seem to work in my hands. In particular, the fcs.dir you specify does not exist. I'll construct an example using the FCS files I think you were trying to steer towards and use it to illustrate the method.

> library(flowCore)
> library(flowStats)
> fcs.dir <- system.file("extdata", "compdata", "data", package="flowCore")
> matchfile <- system.file("extdata", "compdata", "comp_match", package="flowCore")

The matchfile is just a simple 2-column csv mapping channel to control file

> writeLines(readLines(matchfile))
filename,channel
060909.001,unstained
060909.002,FL1-H
060909.004,FL4-H
060909.005,FL3-H
060909.003,FL2-H

Note you could use lapply with read.FCS as you were (if the paths were correct), but read.flowSet is a bit easier

> frames <- read.flowSet(list.files(fcs.dir, full.names=TRUE))
> sampleNames(frames)
[1] "060909.001" "060909.002" "060909.003" "060909.004" "060909.005"

spillover_match updates the sample names of the flowSet using the matchfile. Unfortunately the scatter channels in this example data are -H instead of the default -A, so we need to specify that.

> matched <- spillover_match(frames, fsc="FSC-H", ssc="SSC-H", matchfile=matchfile)
> sampleNames(matched)
[1] "unstained" "FL1-H"     "FL2-H"     "FL4-H"     "FL3-H"    

So now the spillover matrix calculation method will know which control sample goes with which channel. Note the prematched=TRUE flag here, which will tell the spillover method not to attempt its automated logic to match control files to channels, but use the sample names as they are.

> comp_matrix <- spillover(matched, fsc="FSC-H", ssc="SSC-H", prematched=TRUE)
> comp_matrix
             FL1-H        FL2-H       FL3-H       FL4-H
FL1-H 1.0000000000 0.2420222776 0.032083706 0.001127816
FL2-H 0.0077220477 1.0000000000 0.140788232 0.002632689
FL3-H 0.0150806322 0.1755899032 1.000000000 0.229593860
FL4-H 0.0007590319 0.0009620459 0.003218614 1.000000000

That compensation matrix could then be used with the compensate method. Note that you don't need to use spillover_match. You could manually reassign the sampleNames of each of the controls in the flowSet yourself, like so:

sampleNames(frames) <- c("unstained", "FL1-H", "FL2-H", "FL4-H", "FL3-H")

And then still use the prematched=TRUE flag in spillover. The spillover_match functionality mostly helps for repeated batch processing.
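For anyone who wants to derive that vector of names from the matchfile rather than typing it out, here is a minimal base-R sketch. It only illustrates the filename-to-channel lookup; it is not the code spillover_match runs internally. The matchfile contents and sample names are copied from the example above, and the variable names (matchfile_text, match_df, new_names) are made up for illustration.

```r
# Matchfile contents as shown earlier in this thread, inlined so the sketch is
# self-contained; in practice this would be read from the CSV file on disk.
matchfile_text <- "filename,channel
060909.001,unstained
060909.002,FL1-H
060909.004,FL4-H
060909.005,FL3-H
060909.003,FL2-H"

# colClasses = "character" keeps file names such as "060909.001" from being
# parsed as numbers, which would drop the leading zero.
match_df <- read.csv(text = matchfile_text, colClasses = "character")

# Sample names in the order reported by sampleNames(frames) above.
sample_names <- c("060909.001", "060909.002", "060909.003",
                  "060909.004", "060909.005")

# Look up the channel for each sample; fail early if a file is not listed.
idx <- match(sample_names, match_df$filename)
stopifnot(!anyNA(idx))
new_names <- match_df$channel[idx]
new_names

# With the flowSet loaded, the result would be applied with:
#   sampleNames(frames) <- new_names
```

The stopifnot guard is there because a control file missing from the matchfile would otherwise silently produce an NA sample name.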

Also, keep in mind that the spillover method is overloaded. If you call spillover on a single flowFrame, it will attempt to extract the spillover matrix stored in the FCS file using the appropriate keywords. If you call spillover on a flowSet, it will attempt to calculate the spillover matrix, assuming the set of flowFrames represents a set of controls.
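A rough sketch of both uses, followed by the compensate step mentioned earlier, using the same example data shipped with flowCore. The tryCatch around the flowFrame call is a precaution added here, on the assumption that the lookup may fail or come back empty when the file stores no matrix. Applying the matrix back to the control samples is only for illustration; normally you would compensate your experimental samples.

```r
library(flowCore)
library(flowStats)

fcs.dir <- system.file("extdata", "compdata", "data", package = "flowCore")
frames <- read.flowSet(list.files(fcs.dir, full.names = TRUE))

# flowFrame input: look for a spillover matrix stored in the file's keywords.
# Guarded with tryCatch in case this file carries no stored matrix (assumption).
stored <- tryCatch(spillover(frames[[1]]), error = function(e) NULL)

# flowSet input: calculate the matrix from the single-stained controls.
# Names are assigned by hand, as described above, so prematched = TRUE applies.
sampleNames(frames) <- c("unstained", "FL1-H", "FL2-H", "FL4-H", "FL3-H")
comp_matrix <- spillover(frames, fsc = "FSC-H", ssc = "SSC-H", prematched = TRUE)

# Apply the calculated matrix; here to the controls themselves, as a demo only.
compensated <- compensate(frames, comp_matrix)
```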

Please let me know if you have any more questions.

jacobpwagner commented 4 years ago

Regarding the error, that should not be happening and I don't see it in my local testing. But without a reproducible specific code chunk, it is hard for me to find out why you are seeing it. Your code chunk gives me other errors (because the directory doesn't exist). If you update it, I can figure out the cause.

vhollek commented 4 years ago

Thank you for your answer!

I tried it again with the same samples and used exactly your code but unfortunately it still doesn't work.

> fcs.dir <- system.file("extdata", "compdata", "data", package="flowCore")
> matchfile <- system.file("extdata", "compdata", "comp_match", package="flowCore")
> 
> writeLines(readLines(matchfile))
filename,channel
060909.001,unstained
060909.002,FL1-H
060909.004,FL4-H
060909.005,FL3-H
060909.003,FL2-H
> frames <- read.flowSet(list.files(fcs.dir, full.names=TRUE))
> sampleNames(frames)
[1] "060909.001" "060909.002" "060909.003" "060909.004" "060909.005"
> 
> spillover_match(frames, fsc="FSC-H", ssc="SSC-A", matchfile=matchfile)
Error in value[[3L]](cond) : 
  The flowSet spillover_match method has been moved to the flowStats package.
               Please library(flowStats) first.

(I reinstalled flowStats and loaded it again with library(), but it still gives me that error...)

vhollek commented 4 years ago

The traceback() looks like the following:

6: stop("The flowSet spillover_match method has been moved to the flowStats package.\n Please library(flowStats) first.")
5: value[[3L]](cond)
4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
3: tryCatchList(expr, classes, parentenv, handlers)
2: tryCatch(standardGeneric("spillover_match"), error = function(e) {
       if (is(x, "flowSet")) {
           stop("The flowSet spillover_match method has been moved to the flowStats package.\n Please library(flowStats) first.")
       } else {
           stop(e)
       }
   })
1: flowStats::spillover_match(frames, fsc = "FSC-H", ssc = "SSC-A", matchfile = matchfile)
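Frame 2 of the traceback above shows the generic wrapping standardGeneric() in a tryCatch. With a wrapper shaped like that, any error raised while the call is dispatched or while the method runs would be replaced by the "has been moved" message whenever the argument is a flowSet. The toy below reproduces only that pattern in base R. The class toySet and the generic toy_match are invented for illustration and are not flowCore code, and whether this mechanism accounts for the error reported in this thread is not established here.

```r
library(methods)

# Hypothetical stand-in for a flowSet-like class.
setClass("toySet", representation(n = "numeric"))

# A generic whose body wraps dispatch in tryCatch, mirroring the shape of
# frame 2 in the traceback. Names and messages are made up for this sketch.
setGeneric("toy_match", function(x, ...) {
  tryCatch(standardGeneric("toy_match"), error = function(e) {
    if (is(x, "toySet")) {
      stop("The toySet method has been moved. Please load the other package first.")
    } else {
      stop(e)
    }
  })
})

# A method that IS registered, but fails for an unrelated reason.
setMethod("toy_match", "toySet", function(x, ...) {
  stop("channel 'SSC-A' not found")
})

# Capture the message the caller actually sees.
msg <- tryCatch(toy_match(new("toySet", n = 1)),
                error = function(e) conditionMessage(e))
msg
```

In this toy the caller sees the "has been moved" text even though the method exists, and the original message about the channel is lost. If something similar is suspected, inspecting the method directly, for example with showMethods("spillover_match"), may help separate a dispatch problem from a failure inside the method.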

vhollek commented 4 years ago

I also tried to create a compensation matrix with spillover after setting the sample names as you suggested. At first it didn't work and gave me the error "the number of single stained samples provided in this set doesn't match to the number of stained channels!". After removing the "FLA-1" channel I could get the matrix. This also worked for my own samples. Thank you!

But I am still wondering why spillover_match doesn't work, as it is in the same package as spillover?!

jacobpwagner commented 4 years ago

Hmm. Could you give me the full output of sessionInfo() as well as the full exact code (including library statements run in the session before calling spillover_match)?

And preferably sessionInfo() output right before the spillover_match call.

jacobpwagner commented 4 years ago

Also, your call to spillover_match is still using SSC-A there instead of SSC-H, which will cause a problem for that example dataset. If the same is true of your spillover call, that may explain the other error you saw.

vhollek commented 4 years ago

> library(xtable)
> library(testthat)
> library(openCyto)
> library(BiocManager)
Bioconductor version 3.10 (BiocManager 1.30.10), ?BiocManager::install for help
> library(flowWorkspace)
> library(data.table)
data.table 1.12.8 using 4 threads (see ?getDTthreads).  Latest news: r-datatable.com
> library(usethis)
> library(devtools)
Attaching package: ‘devtools’
The following object is masked from ‘package:BiocManager’:
    install
The following object is masked from ‘package:testthat’:
    test_file
> library(RcppArmadillo)
> library(BH)
> library(ncdfFlow)
> library(CytoML)
> library(ggplot2)
> library(ncdfFlow)
> library(ggcyto)
> library(flowCore)
> library(lattice)
> library(flowViz)
> library(flowStats)
> fcs.dir <- system.file("extdata", "compdata", "data", package="flowCore")
> matchfile <- system.file("extdata", "compdata", "comp_match", package="flowCore")
> writeLines(readLines(matchfile))
filename,channel
060909.001,unstained
060909.002,FL1-H
060909.004,FL4-H
060909.005,FL3-H
060909.003,FL2-H
> frames <- read.flowSet(list.files(fcs.dir, full.names=TRUE))
> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] flowStats_3.44.0          flowViz_1.50.0            lattice_0.20-38          
 [4] ggcyto_1.14.0             ggplot2_3.2.1             CytoML_1.12.0            
 [7] ncdfFlow_2.32.0           flowCore_1.52.1           BH_1.72.0-3              
[10] RcppArmadillo_0.9.800.3.0 devtools_2.2.1            usethis_1.5.1            
[13] data.table_1.12.8         flowWorkspace_3.34.1      BiocManager_1.30.10      
[16] openCyto_1.24.0           testthat_2.3.1            xtable_1.8-4             

loaded via a namespace (and not attached):
 [1] matrixStats_0.55.0  fs_1.3.1            RColorBrewer_1.1-2  rprojroot_1.3-2    
 [5] Rgraphviz_2.30.0    backports_1.1.5     tools_3.6.2         R6_2.4.1           
 [9] KernSmooth_2.23-16  lazyeval_0.2.2      BiocGenerics_0.32.0 colorspace_1.4-1   
[13] withr_2.1.2         tidyselect_0.2.5    gridExtra_2.3       prettyunits_1.1.1  
[17] mnormt_1.5-5        processx_3.4.1      compiler_3.6.2      graph_1.64.0       
[21] cli_2.0.1           Biobase_2.46.0      flowClust_3.24.0    desc_1.2.0         
[25] scales_1.1.0        DEoptimR_1.0-8      hexbin_1.28.0       mvtnorm_1.0-11     
[29] robustbase_0.93-5   callr_3.4.0         RBGL_1.62.1         stringr_1.4.0      
[33] digest_0.6.23       R.utils_2.9.2       base64enc_0.1-3     rrcov_1.4-9        
[37] jpeg_0.1-8.1        pkgconfig_2.0.3     sessioninfo_1.1.1   rlang_0.4.2        
[41] rstudioapi_0.10     jsonlite_1.6        mclust_5.4.5        gtools_3.8.1       
[45] dplyr_0.8.3         R.oo_1.23.0         magrittr_1.5        Matrix_1.2-18      
[49] Rcpp_1.0.3          munsell_0.5.0       fansi_0.4.1         lifecycle_0.1.0    
[53] R.methodsS3_1.7.1   yaml_2.2.0          stringi_1.4.4       MASS_7.3-51.5      
[57] zlibbioc_1.32.0     pkgbuild_1.0.6      plyr_1.8.5          grid_3.6.2         
[61] parallel_3.6.2      crayon_1.3.4        splines_3.6.2       ps_1.3.0           
[65] pillar_1.4.3        fda_2.4.8           corpcor_1.6.9       stats4_3.6.2       
[69] pkgload_1.0.2       XML_3.99-0.3        glue_1.3.1          latticeExtra_0.6-29
[73] remotes_2.1.0       RcppParallel_4.4.4  png_0.1-7           gtable_0.3.0       
[77] purrr_0.3.3         clue_0.3-57         assertthat_0.2.1    ks_1.11.6          
[81] IDPmisc_1.1.19      pcaPP_1.9-73        tibble_2.1.3        memoise_1.1.0      
[85] ellipse_0.4.1       cluster_2.1.0       ellipsis_0.3.0 
> spillover_match(rghtchnls, fsc="FSC-H", ssc="SSC-H", matchfile=matchfile)
Error in value[[3L]](cond) : 
  The flowSet spillover_match method has been moved to the flowStats package.
               Please library(flowStats) first.

That's my code including the loaded libraries and sessionInfo(). I hope you see something I didn't. If not I will continue with only using spillover and reassign the sample names manually.

Thanks so much for your time!

jacobpwagner commented 4 years ago

That code still cannot be exactly what you are running from a fresh R session, as you never define rghtchnls (so you should get an error about that object not being found). If I change rghtchnls to frames in the spillover_match call, it still succeeds in my testing across multiple versions of the packages, including the latest Bioconductor release (the versions you are using). You may have rghtchnls already defined in your environment, but then there could be something else interfering with flowStats or specifically spillover_match as well (which is what I'm trying to find out). So in order to figure out your problem, I still need self-contained reproducible code that will work from a fresh R session. That is, your example code should lead to the failing spillover_match call that will have all of its arguments defined after you 1) clear your environment, 2) restart R, and 3) run only that code.

I'm happy to keep searching and sorry for the need for a really precise reproducible example. This is just very strange because this code has been pretty stable and unchanged for a while now, so I'm not sure why this method dispatch would be failing in your hands.

vhollek commented 4 years ago

Oh, I see that I just forgot to copy the line in which I defined rghtchnls (I had it in my code and did run it): rghtchnls <- frames[,c(3,4,5,7,1,2)]. I did this because of the error saying that the number of single stained samples doesn't match the channels in the matchfile. I tried it again this morning after restarting Windows and R, and all of a sudden it worked, even with frames. So I tried to use my own samples, but then got the error again. So maybe something is wrong with my samples... I will generate new ones and test it again. But at least it works with the example data now...

Thanks so much for your help, I'm sorry for bothering you so long!

jacobpwagner commented 4 years ago

No worries at all. I was just really confused about why the spillover_match method dispatch didn't seem to be working. I'm happy to help troubleshoot why it might not be working with your data, but manually setting the sampleNames to match your channels for the spillover calculation method is also a fine solution and it sounds like that was working for you.