RGLab / openCyto

A package that provides data analysis pipeline for flow cytometry.
GNU Affero General Public License v3.0

Function or documentation to create derived parameter #214

Closed steveb-123 closed 3 years ago

steveb-123 commented 4 years ago

Thank you for your wonderful packages.

I use flow cytometry in a non-classical way: to analyse ratiometric experimental quantities, rather than to count the proportions of different populations within a sample.

Ultimately I would like to create derived parameters which function like their own channel in subsequent plotting/analysis/etc. As an example: I would like to create a new parameter which is the result of FL1.A/FL9.A. This should then be plottable like any other channel.

I have found documentation on transform() functions but I am struggling to understand how and when to apply this to my problem, or whether this is even the correct route. I am working in the context of a GatingSet, which is formed from a flowSet of many samples.

In the short term I am selfishly seeking help to work this out, but I think it is a problem many cytometry scientists from non-immunology/haematology labs will come up against and thus should warrant either a dedicated (wrapper?) function and/or specific documentation.

Thanks!

gfinak commented 4 years ago

transform() is not the right mechanism for this. We don't have an api to produce derived parameters like this (at least not easily or directly), but we could think about adding support. Can you tell me more about your use case?

jacobpwagner commented 4 years ago

It is probably possible using fr_append_cols (for flowFrame) or cf_append_cols (for cytoframe) by:
1) Pulling out the columns you need
2) Applying whatever mathematical combination (ratio, sum, etc) to build a named matrix of new columns
3) Using fr_append_cols or cf_append_cols to add them back in

I can put together a small example. But it could be worth adding a simple wrapper of that logic if it's going to be a common request.

steveb-123 commented 4 years ago

I carry out experiments using standard cell lines in culture. These will express either a combination of fluorescent proteins, or artificial epitope labels for surface staining. Experiments typically explore molecular mechanisms where the assay measures the fluorescence of one channel while another channel is used for normalisation, or for determining a ratio between the two.

In any given cell the fluorescence intensity on either channel holds little information without knowing the corresponding intensity on the other. A useful way to explore this is to calculate the ratio between the channels, e.g. FL1.A/FL12.A.

Experiments like this are undergoing an explosion in popularity because the data are intrinsically quantitative, and provide a much-improved alternative to microscopy quantitation.

This concept is described more in the cytoflow github

some paper examples: here here

steveb-123 commented 4 years ago

In the meantime, is there a hack approach that I can take? (perhaps manipulating the flowset pre gating?)

steveb-123 commented 4 years ago

@jacobpwagner a small example would be greatly appreciated. And longer term, a wrapper would make a lot of sense.

As I said, these kinds of experiments are pulling more and more people into flow cytometry. Part of the draw is that once a good fluorescent assay is developed, one can then turn it into a CRISPR screen and sort mutants.

jacobpwagner commented 4 years ago

If you are using flowSet (the older structure) as opposed to cytoset (the newer reference-based structure):

library(flowCore)
library(flowWorkspace)
data("GvHD")
fs <- GvHD

fs <- fsApply(fs, function(fr){
  existing_subset <- exprs(fr)[,c("FL1-H", "FL2-H")]
  derived_col <- existing_subset[,"FL1-H"]/existing_subset[,"FL2-H"]
  derived_col <- matrix(derived_col, ncol = 1)
  colnames(derived_col) <- "ratio_1_2"
  fr_append_cols(fr, derived_col)
})

For cytoset:

cs <- lapply(cs, function(cf){
  existing_subset <- exprs(cf)[,c("FL1-H", "FL2-H")]
  derived_col <- existing_subset[,"FL1-H"]/existing_subset[,"FL2-H"]
  derived_col <- matrix(derived_col, ncol = 1)
  colnames(derived_col) <- "ratio_1_2"
  cf_append_cols(cf, derived_col)
})
jacobpwagner commented 4 years ago
> head(exprs(fs[[1]]))
     FSC-H SSC-H       FL1-H       FL2-H     FL3-H FL2-A     FL4-H Time    ratio_1_2
[1,]   371   396 2432.871983  507.887297 18.156914   110 21.739192    1  4.790180805
[2,]   190    62    7.513726 1006.775298 26.982678   213  1.000000    1  0.007463160
[3,]   141   197    3.194470  597.239582  3.109343   132 29.524716    1  0.005348725
[4,]   167   265 1977.824185  143.998435  5.839470    28  4.579326    1 13.735039478
[5,]   128    30    1.321941    1.640793  5.632914     0  1.655632    1  0.805672162
[6,]   208    60    8.295949 1121.639700 10.868463   240  1.000000    1  0.007396269
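
A minimal sketch of the "simple wrapper" idea mentioned earlier in the thread, assuming the same flowSet workflow. The helper name `append_ratio_column` and its arguments are hypothetical (they are not part of flowCore or flowWorkspace); it only wraps the three steps already shown and adds a warning when the division yields non-finite values.

```r
library(flowCore)
library(flowWorkspace)

# Hypothetical helper, not part of flowCore or flowWorkspace:
# appends numerator/denominator as a new named column of a flowFrame.
append_ratio_column <- function(fr, numerator, denominator, new_name) {
  values <- exprs(fr)
  ratio <- values[, numerator] / values[, denominator]
  # A zero or missing denominator gives Inf/NaN/NA; flag it rather than hide it
  if (any(!is.finite(ratio))) {
    warning("Non-finite ratios found; check the denominator channel.")
  }
  derived_col <- matrix(ratio, ncol = 1, dimnames = list(NULL, new_name))
  fr_append_cols(fr, derived_col)
}

data("GvHD")
# Extra arguments to fsApply are passed on to the helper for every flowFrame
fs <- fsApply(GvHD[1:2], append_ratio_column,
              numerator = "FL1-H", denominator = "FL2-H",
              new_name = "ratio_1_2")
colnames(fs)
```

Because the channel names are arguments, the same helper could presumably be reused for other pairs such as FL9-A and FL1-A.
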
steveb-123 commented 4 years ago

I'm checking out right now, but you might beat me to it since you're here: can this be performed after normalization?

jacobpwagner commented 4 years ago

Sure, that same logic I showed can be used at any point. The derived channel just won't have any compensation/transformation associated with it until you add them. If the data is already contained in a GatingSet, you will probably just need to:
1) Pull the data out with gs_cyto_data
2) Add the derived columns as above
3) Replace the data with gs_cyto_data<-

I might also not be understanding your question exactly. Which normalization step were you thinking of? Earlier you mentioned using this ratio to normalize one channel by another (so adding these columns is the normalization step in that case). It might be cleared up once I have time to read the papers in a bit.
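
The three GatingSet steps described above could look roughly like the following. This is a sketch rather than a tested recipe: it assumes the CytoTrol example files from the flowWorkspaceData package are installed, and it saves and restores pData defensively in case the round trip alters it.

```r
library(flowWorkspace)

# Example data: CytoTrol FCS files shipped with flowWorkspaceData (assumed installed)
fcs_files <- list.files(system.file("extdata", package = "flowWorkspaceData"),
                        pattern = "CytoTrol", full.names = TRUE)
gs <- GatingSet(load_cytoset_from_fcs(fcs_files))

# 1) Pull the data out of the GatingSet
cs <- gs_cyto_data(gs)
pd <- pData(cs)

# 2) Add the derived column to every cytoframe
cs_new <- cytoset(lapply(cs, function(cf) {
  values <- exprs(cf)[, c("B710-A", "R660-A")]
  derived_col <- matrix(values[, "B710-A"] / values[, "R660-A"], ncol = 1,
                        dimnames = list(NULL, "ratio_B710_R660"))
  cf_append_cols(cf, derived_col)
}))
pData(cs_new) <- pd

# 3) Put the modified data back into the GatingSet
gs_cyto_data(gs) <- cs_new
colnames(gs_cyto_data(gs))
```
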

steveb-123 commented 4 years ago

As an example in my own case, I have a large population of events/cells in every experiment which is essentially negative for both FL channels. For mysterious technical reasons this negative population can move around slightly between samples on an XY plot of those two channels. It would be useful to warp the data before the ratio calculation so that the 'origins' are aligned between samples.

Check out the purple arrow in the attached plot. It should be at the same location as in the samples with black arrows. This is what I wish to normalise. My ratio analysis is between the two axes shown in this plot; note how the angle and shape differ between proteins.

Screenshot from 2020-07-02 19-32-17

edit: changed the facet label from 'channel' to less-confusing 'protein'

steveb-123 commented 4 years ago

That's just to clarify what I was saying about normalisation. I hope it also explains a little about these ratio experiments and how the data look. I'm conscious of the topic drift here.

mikejiang commented 4 years ago

Following up on the easy wrapper for creating derived parameters, flowCore::transform does provide such functionality through the in-line expression

> data("GvHD")
> fs <- GvHD[1:2,3:4]
> fs <- transform(fs, ratio_1_2 = `FL1-H`/`FL2-H`)
> fs[[1]]  
flowFrame object 's5a01'
with 3420 cells and 3 observables:
          name                                           desc range minRange maxRange
$P3      FL1-H                                      CD15 FITC  1024        1    10000
$P4      FL2-H                                        CD45 PE  1024        1    10000
$P31 ratio_1_2 derived from transformation of FL1-H and FL2-H    NA        1        1
170 keywords are stored in the 'description' slot
> head(exprs(fs[[1]]))
           FL1-H       FL2-H    ratio_1_2
[1,] 2432.871983  507.887297  4.790180805
[2,]    7.513726 1006.775298  0.007463160
[3,]    3.194470  597.239582  0.005348725
[4,] 1977.824185  143.998435 13.735039478
[5,]    1.321941    1.640793  0.805672162
[6,]    8.295949 1121.639700  0.007396269

But it doesn't work with cytoset/cytoframe yet, so you have to stick to flowSet/flowFrame before we decide to extend it.

@jacobpwagner's approach is more robust and flexible, so it is probably worth adopting as the general workflow.

Also, this legacy feature of flowCore somewhat abuses the classical meaning of the transform API (as @gfinak was getting at), so I am not sure whether it should be separated out and renamed as its own function.

jacobpwagner commented 4 years ago

Thanks for the reminder, Mike. I thought there was support for this within transform, but I somehow forgot about that in-line approach.

But yeah, maybe we shouldn't muddle the meaning of transform and instead make a derive_channel API or something along those lines.

And @steveb-123 , yeah you could apply the ratio after the warping normalization. That should be no problem using the method above after the normalization is done. Is the idea that you would want to apply an openCyto gatingTemplate using that ratio derived parameter as well? What I'm getting at is so far nothing we have discussed really requires openCyto, just flowCore/flowWorkspace.

steveb-123 commented 4 years ago

@jacobpwagner it would be useful to be able to gate on these derived parameters. (If that is what you are asking.. I'm still not super familiar with all the structures and functions in these packages). Right now, I do not need to do that, but it may come up, and makes some sense that it should be possible.

This type of functionality already exists in flow software, I believe (CytExpert from Beckman at least), so I and others would expect to find it available somewhere in the cytoFlow extended family of packages.

jacobpwagner commented 4 years ago

Gating on it is no problem. You can add gates to any channel(s) via flowWorkspace methods (see flowWorkspace::gs_pop_add). So, you can certainly add gates that way.

openCyto is a package to support automated gating pipelines by plugging in automated gating methods. Instead of specifying gates explicitly, you specify a series of methods and arguments to determine gates.

You can read the vignettes for both packages to get more information. Most of the manual gating functionality of software like FlowJo (minus the GUI) is contained in flowWorkspace. https://dillonhammill.github.io/CytoExploreR/ is rapidly adding the GUI layer.

But, the short version is that yes, you can add your ratio channel and then gate on it like any other channel.
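
As a rough illustration of gating on a derived channel, the sketch below appends a ratio column to two GvHD samples, converts the result to a cytoset, and adds a rectangleGate via gs_pop_add. The gate name `high_ratio` and the threshold of 1 are arbitrary choices for the example, not values from this thread.

```r
library(flowCore)
library(flowWorkspace)

data("GvHD")
# Append the derived ratio column to each flowFrame
fs <- fsApply(GvHD[1:2], function(fr) {
  ratio <- exprs(fr)[, "FL1-H"] / exprs(fr)[, "FL2-H"]
  fr_append_cols(fr, matrix(ratio, ncol = 1,
                            dimnames = list(NULL, "ratio_1_2")))
})

# Convert to a cytoset and build the GatingSet
gs <- GatingSet(flowSet_to_cytoset(fs))

# Gate on the derived channel like any other channel
# (threshold of 1 is arbitrary, for illustration only)
ratio_gate <- rectangleGate(filterId = "high_ratio", "ratio_1_2" = c(1, Inf))
gs_pop_add(gs, ratio_gate, parent = "root")
recompute(gs)
gs_pop_get_stats(gs)
```
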

steveb-123 commented 4 years ago

@jacobpwagner This worked as expected last week:

fs <- fsApply(fs, function(fr){
  existing_subset <- exprs(fr)[,c("FL9-A", "FL1-A")]
  derived_col <- existing_subset[,"FL9-A"]/existing_subset[,"FL1-A"]
  derived_col <- matrix(derived_col, ncol = 1)
  colnames(derived_col) <- "ratio_9_1"
  fr_append_cols(fr, derived_col)
})

But after setting up my original analysis based on flowset, I now have had to switch to cytoset, because the original method maxed out my memory when I upscaled from the original small test set, to my complete set of samples.

Now, if I run this:

cs <- lapply(cs, function(cf){
  existing_subset <- exprs(cf)[,c("FL9-A", "FL1-A")]
  derived_col <- existing_subset[,"FL9-A"]/existing_subset[,"FL1-A"]
  derived_col <- matrix(derived_col, ncol = 1)
  colnames(derived_col) <- "ratio_9_1"
  cf_append_cols(cf, derived_col)
})

I cannot convert my cytoset to a gating set when I run gs <- GatingSet(cs). Instead I get the error:

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘GatingSet’ for signature ‘"list", "missing"

I wonder if this is the result of the empty "desc" fields I see when I print cs?

steveb-123 commented 4 years ago

For completeness:

library("tidyverse")
library("ggcyto")
library("flowClust")
library("openCyto")
library("data.table")
library("flowWorkspace")
library("parallel")
library("flowStats")

setwd("~/Dropbox/c_elegans/flow/HEK293S-Flp-Trex-BG-EMC4-2020-01-21/test4/")

cs <- load_cytoset_from_fcs(path = "./full/")

## copy phenodata table to add some new columns by regex based on filename
pdata <- pData(cs)
head(pdata)
pdata <- gsub(pattern = "-Well|.fcs", replacement = "", x = pdata$name) %>%
  str_split_fixed(pattern = "-", n=2) %>%
  cbind(pdata, .) # Split (file)name field into well and plate
colnames(pdata)[2:3] <- c("plate", "well") # fix up colnames
head(pdata)

## import platemap and match up gpcrs
flowPlateMap <-
  read.csv2(file = "~/Dropbox/c_elegans/flow/shared-files/flowplatemap2.csv", header = TRUE, sep = ",")
pdata <- merge(pdata, flowPlateMap, by = "well")
head(pdata)

## assign genotypes based on plate
pdata$genotype <- NA
pdata$genotype[pdata$plate == "01"] <- "WT"
pdata$genotype[pdata$plate == "02"] <- "XX"
head(pdata)

## add valid row.names and assign as pData
row.names(pdata) <- pdata$name # essential
pData(cs) <- pdata

## Make the description entries for the new columns in varMetadata table
newRows <- data.frame(
  "labelDescription" = str_to_title(colnames(pdata)),
  row.names = colnames(pdata)
)
cs@phenoData@varMetadata <- newRows

cs <- lapply(cs, function(cf){
  existing_subset <- exprs(cf)[,c("FL9-A", "FL1-A")]
  derived_col <- existing_subset[,"FL9-A"]/existing_subset[,"FL1-A"]
  derived_col <- matrix(derived_col, ncol = 1)
  colnames(derived_col) <- "ratio_9_1"
  cf_append_cols(cf, derived_col)
})

gs <- GatingSet(cs)
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] flowStats_4.0.0           data.table_1.12.8        
 [3] openCyto_2.0.0            flowClust_3.26.0         
 [5] ggcyto_1.16.0             flowWorkspace_4.0.6      
 [7] ncdfFlow_2.35.1           BH_1.72.0-3              
 [9] RcppArmadillo_0.9.900.1.0 flowCore_2.0.1           
[11] forcats_0.5.0             stringr_1.4.0            
[13] dplyr_1.0.0               purrr_0.3.4              
[15] readr_1.3.1               tidyr_1.1.0              
[17] tibble_3.0.1              ggplot2_3.3.2            
[19] tidyverse_1.3.0          

loaded via a namespace (and not attached):
 [1] nlme_3.1-147        matrixStats_0.56.0  fs_1.4.2           
 [4] lubridate_1.7.9     RColorBrewer_1.1-2  httr_1.4.1         
 [7] Rgraphviz_2.32.0    tools_4.0.2         backports_1.1.8    
[10] R6_2.4.1            KernSmooth_2.23-17  DBI_1.1.0          
[13] BiocGenerics_0.34.0 colorspace_1.4-1    withr_2.2.0        
[16] tidyselect_1.1.0    gridExtra_2.3       mnormt_2.0.1       
[19] compiler_4.0.2      graph_1.66.0        cli_2.0.2          
[22] rvest_0.3.5         Biobase_2.48.0      xml2_1.3.2         
[25] scales_1.1.1        DEoptimR_1.0-8      robustbase_0.93-6  
[28] mvtnorm_1.1-1       hexbin_1.28.1       RBGL_1.64.0        
[31] digest_0.6.25       R.utils_2.9.2       rrcov_1.5-2        
[34] jpeg_0.1-8.1        pkgconfig_2.0.3     dbplyr_1.4.4       
[37] rlang_0.4.6         readxl_1.3.1        rstudioapi_0.11    
[40] generics_0.0.2      jsonlite_1.7.0      gtools_3.8.2       
[43] mclust_5.4.6        R.oo_1.23.0         magrittr_1.5       
[46] Matrix_1.2-18       RProtoBufLib_2.0.0  Rcpp_1.0.4.6       
[49] munsell_0.5.0       fansi_0.4.1         lifecycle_0.2.0    
[52] R.methodsS3_1.8.0   stringi_1.4.6       MASS_7.3-51.6      
[55] zlibbioc_1.34.0     plyr_1.8.6          grid_4.0.2         
[58] blob_1.2.1          crayon_1.3.4        lattice_0.20-41    
[61] splines_4.0.2       haven_2.3.1         hms_0.5.3          
[64] tmvnsim_1.0-2       pillar_1.4.4        fda_5.1.4          
[67] corpcor_1.6.9       stats4_4.0.2        reprex_0.3.0       
[70] XML_3.99-0.4        glue_1.4.1          latticeExtra_0.6-29
[73] RcppParallel_5.0.2  modelr_0.1.8        png_0.1-7          
[76] vctrs_0.3.1         cellranger_1.1.0    gtable_0.3.0       
[79] clue_0.3-57         assertthat_0.2.1    ks_1.11.7          
[82] broom_0.5.6         pcaPP_1.9-73        IDPmisc_1.1.20     
[85] flowViz_1.52.0      cytolib_2.0.3       ellipse_0.4.2      
[88] cluster_2.1.0       ellipsis_0.3.1     
>
jacobpwagner commented 4 years ago

Sorry, I forgot to mention. This will yield a list of cytoframe objects (so calling it cs was a little premature on my part):

cs <- lapply(cs, function(cf){
     existing_subset <- exprs(cf)[,c("FL1-H", "FL2-H")]
     derived_col <- existing_subset[,"FL1-H"]/existing_subset[,"FL2-H"]
     derived_col <- matrix(derived_col, ncol = 1)
     colnames(derived_col) <- "ratio_1_2"
     cf_append_cols(cf, derived_col)
 })

To make it a cytoset object, just coerce it using cytoset(), which you can then use in GatingSet:

cs <- cytoset(cs)
gs <- GatingSet(cs)
steveb-123 commented 4 years ago

Cool, thanks, that (mostly) worked. Curiously it breaks the "name" pData field:

> pData(cs)

                            name
01-Well-A1.fcs  file453374081b26
01-Well-A2.fcs  file45337bcf07e

Thankfully row.names are intact so it's fixable manually.

Weird?

steveb-123 commented 4 years ago

That's totally fair, I figured it would be fixed in a dedicated function.

Mostly just posted this in case someone else finds this and repeats it.

Actually, as I think further.. is this the expected behaviour in cytoset()? Is that where these values are being created? If the row.names are preserved the actual filename info must be preserved in the split? Might not be important..

jacobpwagner commented 4 years ago

Deleting my prior comment because it was a little incorrect in this case. Let me look in to how this is coming about and get back to you. It's not coming from the simple cytoset() coercion, though. Actually, the first part of my comment is still fine and useful so I'll re-add that:

Yeah, this is sort of outside of the realm of normal workflow because we don't have a wrapper for this yet. I will think about how to make a nice general wrapper for this sort of thing, but in the meantime you can just save the pData before you split it and then re-assign it:

pd <- pData(cs)
cs <- cytoset(lapply(cs, function(cf){
  existing_subset <- exprs(cf)[,c("FL1-H", "FL2-H")]
  derived_col <- existing_subset[,"FL1-H"]/existing_subset[,"FL2-H"]
  derived_col <- matrix(derived_col, ncol = 1)
  colnames(derived_col) <- "ratio_1_2"
  cf_append_cols(cf, derived_col)
}))
pData(cs) <- pd
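
One possible shape for the general wrapper mentioned above, offered only as a sketch. The function name `cs_derive_channel` and its arguments are hypothetical and not part of flowWorkspace; it bundles the save-pData, lapply, cytoset() and restore-pData steps into one call. The example assumes the flowWorkspaceData package is installed and that `load_cytoset_from_fcs` accepts a `backend` argument, which may require a recent flowWorkspace.

```r
library(flowWorkspace)

# Hypothetical wrapper, not part of flowWorkspace.
# `fun` receives the expression matrix of one cytoframe and must return
# one numeric value per event.
cs_derive_channel <- function(cs, new_name, fun) {
  pd <- pData(cs)                      # save sample annotations
  cf_list <- lapply(cs, function(cf) {
    derived_col <- matrix(fun(exprs(cf)), ncol = 1,
                          dimnames = list(NULL, new_name))
    cf_append_cols(cf, derived_col)
  })
  cs_new <- cytoset(cf_list)           # list of cytoframes -> cytoset
  pData(cs_new) <- pd                  # restore sample annotations
  cs_new
}

fcs_files <- list.files(system.file("extdata", package = "flowWorkspaceData"),
                        pattern = "CytoTrol", full.names = TRUE)
cs <- load_cytoset_from_fcs(fcs_files, backend = "mem")
cs <- cs_derive_channel(cs, "ratio_B710_R660",
                        function(m) m[, "B710-A"] / m[, "R660-A"])
colnames(cs)
```

Passing a function rather than two channel names keeps the sketch open to sums, differences, or other combinations, not just ratios.
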
jacobpwagner commented 4 years ago

Okay, I figured it out. These are mostly notes to myself about how to fix this, so don't worry about reading it all. That is coming about because of cf_append_cols. For cytoframes of certain backend types it is still using the older logic of going through a flowFrame intermediate: https://github.com/RGLab/flowWorkspace/blob/ff4a00642d687b86b59b25476c2b1ac01d42c54a/R/cytoframe.R#L969-L972

It does that by coercing to a flowFrame, adding the columns to the flowFrame, writing out to a temporary FCS file, and then reading back in to a cytoframe: https://github.com/RGLab/flowWorkspace/blob/ff4a00642d687b86b59b25476c2b1ac01d42c54a/R/cytoframe.R#L703-L705

That's where those weird filenames are coming from. We recently updated cf_append_cols for in-memory cytoframes in part to avoid this cytoframe->flowFrame->fr_append_cols->FCS->cytoframe process by directly modifying the C-level cytoframe structure: https://github.com/RGLab/cytolib/commit/b7dd39e4cde275770593c4524908ae06cd500bc9 https://github.com/RGLab/flowWorkspace/commit/e87fbdfa3508a44f14ece397130944213713ee03

We just haven't gotten around to doing it for h5 or tiledb cytoframes yet. This explains why I wasn't seeing a problem when I just loaded in a cytoset directly. For example, this has the filename issue because it uses the h5 backend and still hits the old cf_append_cols logic:

> cs <- load_cytoset_from_fcs(list.files(system.file("extdata", package = "flowWorkspaceData"), pattern = "CytoTrol", full.names = TRUE), backend = "h5")
> pData(cs)
                                           name
CytoTrol_CytoTrol_1.fcs CytoTrol_CytoTrol_1.fcs
CytoTrol_CytoTrol_2.fcs CytoTrol_CytoTrol_2.fcs
> cs <- cytoset(lapply(cs, function(cf){
+     existing_subset <- exprs(cf)[,c("B710-A", "R660-A")]
+     derived_col <- existing_subset[,"B710-A"]/existing_subset[,"R660-A"]
+     derived_col <- matrix(derived_col, ncol = 1)
+     colnames(derived_col) <- "ratio_B710_R660"
+     cf_append_cols(cf, derived_col)
+ }))
> pData(cs)
                                    name
CytoTrol_CytoTrol_1.fcs file131040c154f6
CytoTrol_CytoTrol_2.fcs file1310536b74b5

While this does not because it hits the in-memory cytoframe logic (https://github.com/RGLab/flowWorkspace/blob/ff4a00642d687b86b59b25476c2b1ac01d42c54a/R/cytoframe.R#L975-L976):

> cs <- load_cytoset_from_fcs(list.files(system.file("extdata", package = "flowWorkspaceData"), pattern = "CytoTrol", full.names = TRUE), backend = "mem")
> pData(cs)
                                           name
CytoTrol_CytoTrol_1.fcs CytoTrol_CytoTrol_1.fcs
CytoTrol_CytoTrol_2.fcs CytoTrol_CytoTrol_2.fcs
> cs <- cytoset(lapply(cs, function(cf){
+     existing_subset <- exprs(cf)[,c("B710-A", "R660-A")]
+     derived_col <- existing_subset[,"B710-A"]/existing_subset[,"R660-A"]
+     derived_col <- matrix(derived_col, ncol = 1)
+     colnames(derived_col) <- "ratio_B710_R660"
+     cf_append_cols(cf, derived_col)
+ }))
> pData(cs)
                                           name
CytoTrol_CytoTrol_1.fcs CytoTrol_CytoTrol_1.fcs
CytoTrol_CytoTrol_2.fcs CytoTrol_CytoTrol_2.fcs
steveb-123 commented 4 years ago

Oh dear - sorry I'm dragging this on. The following

pdata <- pData(cs)
cs <- cytoset(lapply(cs, function(cf){
  existing_subset <- exprs(cf)[,c("FL9-A", "FL1-A")]
  derived_col <- existing_subset[,"FL9-A"]/existing_subset[,"FL1-A"]
  derived_col <- matrix(derived_col, ncol = 1)
  colnames(derived_col) <- "ratio_9_1"
  cf_append_cols(cf, derived_col)
}))
pData(cs) <- pdata

worked out perfectly for 20 .fcs files, but it crashes my R process when I try to run it over my full set (160 files, ~3 GB total). This was on a computer with 16 GB of memory. Is it likely just an R/memory hardware limitation?

mikejiang commented 4 years ago

Check the output of cf_get_uri(cs[[1]]) to see if it is in-memory or on-disk.
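
A short sketch of that check, assuming a flowWorkspace version recent enough to provide `cf_get_uri` and the `backend` argument, and that the flowWorkspaceData example files are installed. My understanding, stated as an assumption, is that an on-disk backend such as h5 reports a file location, whereas an in-memory cytoframe reports an empty string.

```r
library(flowWorkspace)

fcs_files <- list.files(system.file("extdata", package = "flowWorkspaceData"),
                        pattern = "CytoTrol", full.names = TRUE)

# Request an on-disk (h5) backend so event data need not all sit in RAM
cs <- load_cytoset_from_fcs(fcs_files, backend = "h5")

# Where is the first cytoframe stored?
cf_get_uri(cs[[1]])

# Internal (unexported) helper used elsewhere in this thread to report the backend
flowWorkspace:::cf_backend_type(cs[[1]])
```
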

jacobpwagner commented 4 years ago

After https://github.com/RGLab/flowWorkspace/commit/cd8f9f95fd51aaa6c20dba0c8cf480b52e51e493, the pData should be preserved for all cytoframe backends because they all direct through the in-memory cytoframe method before being converted back to their original backend format:

> for(backend in c("mem", "tile", "h5")){
+   cs <- load_cytoset_from_fcs(list.files(system.file("extdata", package = "flowWorkspaceData"), pattern = "CytoTrol", full.names = TRUE), backend = backend)
+   print("**********")
+   print(paste0("** ", backend, " Pre**"))
+   print("Backend:")
+   print(flowWorkspace:::cf_backend_type(cs[[1]]))
+   print("pData:")
+   print(pData(cs))
+   print("exprs:")
+   print(head(exprs(cs[[1]])))
+   
+   cs <- cytoset(lapply(cs, function(cf){
+     existing_subset <- exprs(cf)[,c("B710-A", "R660-A")]
+     derived_col <- existing_subset[,"B710-A"]/existing_subset[,"R660-A"]
+     derived_col <- matrix(derived_col, ncol = 1)
+     colnames(derived_col) <- "ratio_B710_R660"
+     cf_append_cols(cf, derived_col)
+   }))
+   print(paste0("** ", backend, " Post**"))
+   print("Backend:")
+   print(flowWorkspace:::cf_backend_type(cs[[1]]))
+   print("pData:")
+   print(pData(cs))
+   print("exprs:")
+   print(head(exprs(cs[[1]])))
+   print("**********")
+ }
[1] "**********"
[1] "** mem Pre**"
[1] "Backend:"
[1] "mem"
[1] "pData:"
                                           name
CytoTrol_CytoTrol_1.fcs CytoTrol_CytoTrol_1.fcs
CytoTrol_CytoTrol_2.fcs CytoTrol_CytoTrol_2.fcs
[1] "exprs:"
         FSC-A  FSC-H    FSC-W     SSC-A   B710-A   R660-A    R780-A    V450-A   V545-A   G560-A   G780-A Time
[1,] 140733.05 133376 69150.98  91113.96 22311.24 35576.07  14302.16 16232.649  7644.65  4113.60 12672.00  0.2
[2,]  26195.32  26207 65506.79  10115.28     5.04   447.93    682.56    43.700    77.90   -91.20    18.24  0.4
[3,]  64294.02  51594 81667.89 174620.03   371.28   851.62    -66.36   335.350   971.85   273.60   271.68  0.6
[4,] 128393.87 103613 81210.08 150625.44  1494.36  5672.20   2979.09  1492.450 28790.70   771.84   988.80  0.6
[5,] 127717.88 119616 69974.92  76954.91  2545.20  2272.83 124635.93  8608.899  4190.45 14306.88 58977.60  0.7
[6,] 134347.02 125651 70071.60  70116.48 23052.96  1758.54   5281.15  4849.750  2859.50  2249.28  1560.96  0.7
[1] "** mem Post**"
[1] "Backend:"
[1] "mem"
[1] "pData:"
                                           name
CytoTrol_CytoTrol_1.fcs CytoTrol_CytoTrol_1.fcs
CytoTrol_CytoTrol_2.fcs CytoTrol_CytoTrol_2.fcs
[1] "exprs:"
         FSC-A  FSC-H    FSC-W     SSC-A   B710-A   R660-A    R780-A    V450-A   V545-A   G560-A   G780-A Time ratio_B710_R660
[1,] 140733.05 133376 69150.98  91113.96 22311.24 35576.07  14302.16 16232.649  7644.65  4113.60 12672.00  0.2      0.62714178
[2,]  26195.32  26207 65506.79  10115.28     5.04   447.93    682.56    43.700    77.90   -91.20    18.24  0.4      0.01125176
[3,]  64294.02  51594 81667.89 174620.03   371.28   851.62    -66.36   335.350   971.85   273.60   271.68  0.6      0.43596910
[4,] 128393.87 103613 81210.08 150625.44  1494.36  5672.20   2979.09  1492.450 28790.70   771.84   988.80  0.6      0.26345332
[5,] 127717.88 119616 69974.92  76954.91  2545.20  2272.83 124635.93  8608.899  4190.45 14306.88 58977.60  0.7      1.11983732
[6,] 134347.02 125651 70071.60  70116.48 23052.96  1758.54   5281.15  4849.750  2859.50  2249.28  1560.96  0.7     13.10914649
[1] "**********"
[1] "**********"
[1] "** tile Pre**"
[1] "Backend:"
[1] "tile"
[1] "pData:"
                                           name
CytoTrol_CytoTrol_1.fcs CytoTrol_CytoTrol_1.fcs
CytoTrol_CytoTrol_2.fcs CytoTrol_CytoTrol_2.fcs
[1] "exprs:"
         FSC-A  FSC-H    FSC-W     SSC-A   B710-A   R660-A    R780-A    V450-A   V545-A   G560-A   G780-A Time
[1,] 140733.05 133376 69150.98  91113.96 22311.24 35576.07  14302.16 16232.649  7644.65  4113.60 12672.00  0.2
[2,]  26195.32  26207 65506.79  10115.28     5.04   447.93    682.56    43.700    77.90   -91.20    18.24  0.4
[3,]  64294.02  51594 81667.89 174620.03   371.28   851.62    -66.36   335.350   971.85   273.60   271.68  0.6
[4,] 128393.87 103613 81210.08 150625.44  1494.36  5672.20   2979.09  1492.450 28790.70   771.84   988.80  0.6
[5,] 127717.88 119616 69974.92  76954.91  2545.20  2272.83 124635.93  8608.899  4190.45 14306.88 58977.60  0.7
[6,] 134347.02 125651 70071.60  70116.48 23052.96  1758.54   5281.15  4849.750  2859.50  2249.28  1560.96  0.7
overwriting the existing folder /tmp/f2b66c2b-9f4f-4a56-83a7-e59bd58f97e3.tile!
overwriting the existing folder /tmp/c256d0d0-a761-41f3-8825-ef2585a73392.tile!
[1] "** tile Post**"
[1] "Backend:"
[1] "tile"
[1] "pData:"
                                           name
CytoTrol_CytoTrol_1.fcs CytoTrol_CytoTrol_1.fcs
CytoTrol_CytoTrol_2.fcs CytoTrol_CytoTrol_2.fcs
[1] "exprs:"
         FSC-A  FSC-H    FSC-W     SSC-A   B710-A   R660-A    R780-A    V450-A   V545-A   G560-A   G780-A Time ratio_B710_R660
[1,] 140733.05 133376 69150.98  91113.96 22311.24 35576.07  14302.16 16232.649  7644.65  4113.60 12672.00  0.2      0.62714177
[2,]  26195.32  26207 65506.79  10115.28     5.04   447.93    682.56    43.700    77.90   -91.20    18.24  0.4      0.01125176
[3,]  64294.02  51594 81667.89 174620.03   371.28   851.62    -66.36   335.350   971.85   273.60   271.68  0.6      0.43596908
[4,] 128393.87 103613 81210.08 150625.44  1494.36  5672.20   2979.09  1492.450 28790.70   771.84   988.80  0.6      0.26345333
[5,] 127717.88 119616 69974.92  76954.91  2545.20  2272.83 124635.93  8608.899  4190.45 14306.88 58977.60  0.7      1.11983728
[6,] 134347.02 125651 70071.60  70116.48 23052.96  1758.54   5281.15  4849.750  2859.50  2249.28  1560.96  0.7     13.10914612
[1] "**********"
[1] "**********"
[1] "** h5 Pre**"
[1] "Backend:"
[1] "h5"
[1] "pData:"
                                           name
CytoTrol_CytoTrol_1.fcs CytoTrol_CytoTrol_1.fcs
CytoTrol_CytoTrol_2.fcs CytoTrol_CytoTrol_2.fcs
[1] "exprs:"
         FSC-A  FSC-H    FSC-W     SSC-A   B710-A   R660-A    R780-A    V450-A   V545-A   G560-A   G780-A Time
[1,] 140733.05 133376 69150.98  91113.96 22311.24 35576.07  14302.16 16232.649  7644.65  4113.60 12672.00  0.2
[2,]  26195.32  26207 65506.79  10115.28     5.04   447.93    682.56    43.700    77.90   -91.20    18.24  0.4
[3,]  64294.02  51594 81667.89 174620.03   371.28   851.62    -66.36   335.350   971.85   273.60   271.68  0.6
[4,] 128393.87 103613 81210.08 150625.44  1494.36  5672.20   2979.09  1492.450 28790.70   771.84   988.80  0.6
[5,] 127717.88 119616 69974.92  76954.91  2545.20  2272.83 124635.93  8608.899  4190.45 14306.88 58977.60  0.7
[6,] 134347.02 125651 70071.60  70116.48 23052.96  1758.54   5281.15  4849.750  2859.50  2249.28  1560.96  0.7
[1] "** h5 Post**"
[1] "Backend:"
[1] "h5"
[1] "pData:"
                                           name
CytoTrol_CytoTrol_1.fcs CytoTrol_CytoTrol_1.fcs
CytoTrol_CytoTrol_2.fcs CytoTrol_CytoTrol_2.fcs
[1] "exprs:"
         FSC-A  FSC-H    FSC-W     SSC-A   B710-A   R660-A    R780-A    V450-A   V545-A   G560-A   G780-A Time ratio_B710_R660
[1,] 140733.05 133376 69150.98  91113.96 22311.24 35576.07  14302.16 16232.649  7644.65  4113.60 12672.00  0.2      0.62714177
[2,]  26195.32  26207 65506.79  10115.28     5.04   447.93    682.56    43.700    77.90   -91.20    18.24  0.4      0.01125176
[3,]  64294.02  51594 81667.89 174620.03   371.28   851.62    -66.36   335.350   971.85   273.60   271.68  0.6      0.43596908
[4,] 128393.87 103613 81210.08 150625.44  1494.36  5672.20   2979.09  1492.450 28790.70   771.84   988.80  0.6      0.26345333
[5,] 127717.88 119616 69974.92  76954.91  2545.20  2272.83 124635.93  8608.899  4190.45 14306.88 58977.60  0.7      1.11983728
[6,] 134347.02 125651 70071.60  70116.48 23052.96  1758.54   5281.15  4849.750  2859.50  2249.28  1560.96  0.7     13.10914612
[1] "**********"
DillonHammill commented 4 years ago

This is great @jacobpwagner!

jacobpwagner commented 4 years ago

Thanks. Just to be clear, this still doesn't deal with the underlying pData representation issue of https://github.com/RGLab/flowWorkspace/issues/297. We still have to loop back to that at some point. But this does overlap heavily with https://github.com/RGLab/flowWorkspace/issues/321.

DillonHammill commented 4 years ago

Yes, just glad to see that you are working on this as it is widely used within CytoExploreR.

steveb-123 commented 4 years ago

@mikejiang weird.. I don't have that function

> library("tidyverse")
+ library("ggcyto")
+ library("flowClust")
+ library("openCyto")
+ library("data.table")
+ library("flowWorkspace")
+ library("parallel")
+ library("flowStats")

> cf_get_uri(cs[[1]])
Error in cf_get_uri(cs[[1]]) : could not find function "cf_get_uri"
jacobpwagner commented 4 years ago

@steveb-123, I believe that came over to the master branch when the branch containing commits to add the TileDB backend was merged fairly recently. You may just need to update cytolib and flowWorkspace. If you give the output of sessionInfo() I can check to make sure. Otherwise:

devtools::install_github("RGLab/cytolib")
devtools::install_github("RGLab/flowCore")
devtools::install_github("RGLab/flowWorkspace")

And then try again. Alternatively, you could leave it to the cytoverse package:

remotes::install_github("RGLab/cytoverse")
cytoverse::cytoverse_update(repo="github")