RGLab / CytoML

A GatingML Interface for Cross Platform Cytometry Data Sharing
GNU Affero General Public License v3.0

gatingset_to_flowjo Docker issue #108

Closed Sithara85 closed 4 years ago

Sithara85 commented 4 years ago

Hi,

I am trying to create a FlowJo xml .wsp file from a gating set in R. I had issues with docker and the cytolib/CytoML versions and fixed most of the issues as advised in the ticket - https://github.com/RGLab/CytoML/issues/97

Please see the error below, which I get when I use the sample gating set from the CytoML documentation - https://rdrr.io/bioc/CytoML/man/gatingset_to_flowjo.html.

path <- system.file("extdata", package = "flowWorkspaceData")
gs_path <- list.files(path, pattern = "gs_manual", full = TRUE)
gs <- load_gs(gs_path)

# output to flowJo
outFile <- tempfile(fileext = ".wsp")
gatingset_to_flowjo(gs, outFile)

INFO: Could not find files for the given pattern(s).
Error in gatingset_to_flowjo(gs, outFile) :
  'docker' is not running properly!

This is the first time I am using Docker image. I used admin privilege in Powershell to pull the docker image below:

PS C:\Program Files\Docker\Docker> docker pull rglab/gs-to-flowjo:devel
devel: Pulling from rglab/gs-to-flowjo
df20fa9351a1: Pull complete
df33d3a3b5af: Pull complete
d8b7beb12242: Pull complete
a4e3f3e3cbb1: Pull complete
Digest: sha256:5239ba3089a19f50dd985d8b0b29048020feeb16fe808c55a6bb396b9e75748c
Status: Downloaded newer image for rglab/gs-to-flowjo:devel
docker.io/rglab/gs-to-flowjo:devel

My R sessionInfo() is below to verify package versions:

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] BioInstaller_0.3.7        cytolib_2.1.12            CytoML_2.1.9              scales_1.1.1              ClusterR_1.2.2
 [6] gtools_3.8.2              ggcyto_1.17.0             ncdfFlow_2.35.1           BH_1.72.0-3               RcppArmadillo_0.9.900.1.0
[11] ggplot2_3.3.2             flowWorkspace_4.1.7       flowCore_2.1.1            openCyto_2.1.2

jacobpwagner commented 4 years ago

From Powershell (without admin privileges), can you run this and let me know the output?:

docker run rglab/gs-to-flowjo:devel --cytolib-version

Also just a simple

docker info

jacobpwagner commented 4 years ago

Also, you may have already figured this out, but because you are on Windows 8, you will probably need to use the legacy Docker ToolBox (https://docs.docker.com/toolbox/toolbox_install_windows/) rather than the standard Docker Desktop (https://docs.docker.com/docker-for-windows/install/#system-requirements)

Sithara85 commented 4 years ago

Thank you very much for your quick response!

Without admin privilege I get:

docker run rglab/gs-to-flowjo:devel --cytolib-version

C:\Program Files\Docker\Docker\resources\bin\docker.exe: error during connect: Post http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/containers/create: open //./pipe/docker_engine: Access is denied. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.
See 'C:\Program Files\Docker\Docker\resources\bin\docker.exe run --help'.

docker info

Client:
 Debug Mode: false

Server:
ERROR: error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/info: open //./pipe/docker_engine: Access is denied. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.
errors pretty printing info

It looks like Docker only works for me with admin privileges, which is why it's not running properly from R.

I have installed Docker Desktop. I will try Docker Toolbox and see. Thank you!

jacobpwagner commented 4 years ago

You can also try just making sure that Docker Desktop is running. If it's not, you will get the docker daemon error. If it installed, Docker may have already covered the legacy issue.
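The two failure modes discussed so far (client not installed versus daemon stopped or access denied) can be told apart from a non-elevated shell. The following is a minimal sketch and not part of CytoML: `container_cli_status` is a hypothetical helper name, and it relies only on `docker info` exiting with a non-zero status when the client cannot reach the daemon.

```shell
# Hypothetical helper (not part of CytoML): classify the state of a container CLI.
# $1 is the CLI name; it defaults to docker.
container_cli_status() {
  cli="${1:-docker}"
  if ! command -v "$cli" >/dev/null 2>&1; then
    echo "not installed"
    return 1
  fi
  # `docker info` contacts the daemon, so a non-zero exit suggests the daemon is
  # stopped or the current user is not allowed to connect to it.
  if "$cli" info >/dev/null 2>&1; then
    echo "ready"
  else
    echo "daemon unreachable"
    return 2
  fi
}
```

Running `container_cli_status docker` as the same user that starts R is the relevant check, since R launches Docker with that user's permissions.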

Sithara85 commented 4 years ago

I think the issue is that my account has not been added to the docker user group. On Windows 10, I don't have permission to add myself as a member. I will update here once Docker is running properly.

Sithara85 commented 4 years ago

I wanted to quickly check with you whether there is any other way I can use CytoML/cytolib in R other than pulling the Docker image. Our system administrators are having a problem adding me to the docker user group, and I have an urgent request to provide the .wsp file from R to check the gating method.

jacobpwagner commented 4 years ago

You can definitely use CytoML and its dependencies (including cytolib) without Docker. Just install them from Bioconductor or GitHub. That will allow you to parse in FlowJo workspaces and analyze them using the full set of tools. The only functionality that requires Docker is writing a GatingSet class out to a FlowJo workspace because that is the one part of our packages that is currently closed-source. If that's the functionality that you need, you will need Docker as we do not currently build and distribute binaries of that executable for a range of platforms (hence the Docker solution).

Sithara85 commented 4 years ago

Yeah I need CytoML to produce a workspace file from openCyto gating set. I will work with our IT to get Docker running. Thanks for your inputs.

Sithara85 commented 4 years ago

Hi Jake,

docker run rglab/gs-to-flowjo:devel --cytolib-version shows 2.1.14, but I have 2.1.12 in R, so I got the warning below when I used gatingset_to_flowjo. Could you suggest what's best to do in this case?

Using docker image rglab/gs-to-flowjo:devel to write FlowJo workspace...
Warning message:
In gatingset_to_flowjo(gs, outFile) :
  docker image 'rglab/gs-to-flowjo:devel' is built with different cytolib version of from R package: 2.1.14 vs 2.1.12

jacobpwagner commented 4 years ago

Your cytolib is ahead of Bioconductor, so presumably you installed from GitHub. The gs-to-flowjo:devel image stays up-to-date with every cytolib commit to GitHub, so you'll need to update your branches to the current GitHub branches. The cytoverse package is meant to make this a little easier as it will check all of them for you:

devtools::install_github("RGLab/cytoverse")
cytoverse::cytoverse_update(repo="github")

But once you pull the gs-to-flowjo image, you won't need to (and shouldn't) re-pull it unless you update your packages from GitHub again.
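The mismatch warning comes down to comparing two version strings, so it can be checked before a conversion is attempted. This is a rough sketch under stated assumptions: `cytolib_versions_match` is a hypothetical helper, and the commented wiring assumes that `--cytolib-version` prints only the version number, which the thread suggests but does not spell out.

```shell
# Hypothetical helper: compare the cytolib version reported by R with the one
# reported by the image. Both values are passed in as plain strings.
cytolib_versions_match() {
  r_version="$1"
  image_version="$2"
  if [ "$r_version" = "$image_version" ]; then
    echo "match: $r_version"
  else
    echo "mismatch: image $image_version vs R $r_version"
    return 1
  fi
}

# Example wiring (assumes Rscript and docker are on PATH, and that the image
# prints only the version number):
#   r_v="$(Rscript -e 'cat(as.character(packageVersion("cytolib")))')"
#   img_v="$(docker run rglab/gs-to-flowjo:devel --cytolib-version)"
#   cytolib_versions_match "$r_v" "$img_v"
```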

Sithara85 commented 4 years ago

Thanks Jake! I have installed cytoverse and fixed most of the errors with colortable and vctrs. Now I am stuck installing CytoML. I removed the previous installation but am getting the error below.

compilation terminated.
make: *** [C:/Users/svivek/DOCUME~1/R/R-40~1.2/etc/i386/Makeconf:229: RcppExports.o] Error 1
ERROR: compilation failed for package 'CytoML'

Sithara85 commented 4 years ago

I tried installing directly from GitHub using the command below but got the same error as with cytoverse. Could you please check?

remotes::install_github("RGLab/CytoML")

ERROR:

Downloading GitHub repo RGLab/CytoML@master
Skipping 14 packages ahead of CRAN: openCyto, RBGL, Rgraphviz, Biobase, graph, ggcyto, RProtoBufLib, Rhdf5lib, BiocGenerics, zlibbioc, ncdfFlow, flowViz, flowStats, flowClust
√ checking for file 'C:\Users\svivek\AppData\Local\Temp\RtmpULsJq6\remotes10546ab6271b\RGLab-CytoML-1d8646e/DESCRIPTION' ...

** libs

*** arch - i386
"C:/rtools40/mingw32/bin/"g++ -std=gnu++11 -I"C:/Users/svivek/DOCUME~1/R/R-40~1.2/include" -DNDEBUG -DROUT -I../inst/include/ -I//i386/include/libxml2 -DLIBXML_STATIC -fpermissive -DRCPP_PARALLEL_USE_TBB=1 -I'C:/Users/svivek/Documents/R/R-4.0.2/library/Rcpp/include' -I'C:/Users/svivek/Documents/R/R-4.0.2/library/BH/include' -I'C:/Users/svivek/Documents/R/R-4.0.2/library/RProtoBufLib/include' -I'C:/Users/svivek/Documents/R/R-4.0.2/library/cytolib/include' -I'C:/Users/svivek/Documents/R/R-4.0.2/library/Rhdf5lib/include' -I'C:/Users/svivek/Documents/R/R-4.0.2/library/RcppArmadillo/include' -I'C:/Users/svivek/Documents/R/R-4.0.2/library/RcppParallel/include' -I'C:/Users/svivek/Documents/R/R-4.0.2/library/flowWorkspace/include' -O2 -Wall -mfpmath=sse -msse2 -mstackrealign -c RcppExports.cpp -o RcppExports.o
In file included from ../inst/include/CytoML/workspace_type.hpp:4,
                 from ../inst/include/CytoML/workspace.hpp:10,
                 from ../inst/include/CytoML/flowJoWorkspace.hpp:11,
                 from ../inst/include/CytoML/macFlowJoWorkspace.hpp:10,
                 from ../inst/include/CytoML/openWorkspace.hpp:11,
                 from ../inst/include/CytoML.h:5,
                 from RcppExports.cpp:4:
../inst/include/CytoML/wsNode.hpp:10:10: fatal error: libxml/tree.h: No such file or directory
 #include <libxml/tree.h>
          ^~~~~~~~~~~~~~~
compilation terminated.
make: *** [C:/Users/svivek/DOCUME~1/R/R-40~1.2/etc/i386/Makeconf:229: RcppExports.o] Error 1
ERROR: compilation failed for package 'CytoML'

Sithara85 commented 4 years ago

BTW, my cytolib is now 2.1.14. Hurray! One step closer :)

Sithara85 commented 4 years ago

I will try the fix you mentioned in ticket CytoML #65 for libxml2 dependency. Thanks!

jacobpwagner commented 4 years ago

Sorry for the delay. Yep, adding libxml2 was going to be my advice. Thanks for checking the prior issues. One of the next steps for that installer package is guiding you through the external dependencies so you don't have to hunt through GitHub issues: https://github.com/RGLab/cytoverse/issues/3

Sithara85 commented 4 years ago

Looks like installing libxml2 didn't solve it. I couldn't find it in Bioconductor, so I installed it from GitHub and tried both ways. Am I missing something with libxml2?

devtools::install_github("r-lib/xml2")
install.packages("https://github.com/hadley/xml2/archive/master.tar.gz", type = "source", repos = NULL)

Still get ---- fatal error: libxml/tree.h: No such file or directory

jacobpwagner commented 4 years ago

Yeah, libxml2 is a system library, not an R package. I see you're using Rtools 4.0 (https://cran.r-project.org/bin/windows/Rtools/). There is a new package manager for installing these system libraries: https://github.com/r-windows/docs/blob/master/rtools40.md#readme.

Follow the instructions there to start Rtools Bash and then run this (and approve the installation):

pacman -S mingw-w64-{i686,x86_64}-libxml2

Then try again. By the way, this is all only necessary because you're building from source from GitHub. I just moved over updates to Bioconductor, which will also make the newest binaries available soon via BiocManager::install(). That said, this is sort of the last hurdle you need to go over to build everything from source, so you're almost there.

Sithara85 commented 4 years ago

Thank you! That would be great if user can get all the necessary packages from BiocManager::install(). But now I am working on a time sensitive project and need to make this as early as possible for data release.

I tried the pacman command in Rtools Bash and ran the code below in R. Now I am getting a new error.

Sys.setenv(LOCAL_CPPFLAGS = "-I$(MINGW_PREFIX)/include/libxml2")
install.packages("XML", type = "source")

cytoverse::cytoverse_update(repo="github")

Error:

g++.exe: error: //i386/lib/libxml2.a: No such file or directory
no DLL was created
ERROR: compilation failed for package 'CytoML'

jacobpwagner commented 4 years ago

Can you install XML not from source (just from the binaries available from CRAN)? Try this from a fresh R session (for now don't tinker with LOCAL_CPPFLAGS):

install.packages("XML")
devtools::install_github("RGLab/CytoML")

Let me know if you hit an error, and include the full error output.

Sithara85 commented 4 years ago

Yup, I was trying the approach I found in this ticket - https://github.com/r-windows/checks/issues/5#issue-335598042.

I very much appreciate your prompt responses.

Sithara85 commented 4 years ago

Jake,

That didn't work. I am getting the same error. Sorry to post so many errors.

C:/rtools40/mingw32/bin/g++ -shared -s -static-libgcc -o CytoML.dll tmp.def RcppExports.o parseFlowJoWorkspace.o -LC:/Users/svivek/DOCUME~1/R/R-40~1.2/bin/i386 -lRlapack -LC:/Users/svivek/DOCUME~1/R/R-40~1.2/bin/i386 -lRblas -lgfortran -lm -lquadmath //i386/lib/libxml2.a C:/Users/svivek/Documents/R/R-4.0.2/library/cytolib/lib/i386/libcytolib.a C:/Users/svivek/Documents/R/R-4.0.2/library/RProtoBufLib/lib/i386/libprotobuf-lite.a -LC:/Users/svivek/Documents/R/R-4.0.2/library/RcppParallel/lib/i386 -ltbb -ltbbmalloc -LC:/Users/svivek/DOCUME~1/R/R-40~1.2/library/Rhdf5lib/lib/i386 -lhdf5_cpp -lhdf5 -lcurl -lssh2 -lssl -lcrypto -lwldap32 -lws2_32 -lcrypt32 -lszip -lz -lpsapi -lws2_32 -LC:/Users/svivek/DOCUME~1/R/R-40~1.2/bin/i386 -lR
g++.exe: error: //i386/lib/libxml2.a: No such file or directory
no DLL was created
ERROR: compilation failed for package 'CytoML'

Sithara85 commented 4 years ago

I did as detailed in the ticket https://github.com/igraph/igraph/issues/915

copied over the libxml2 files from local323.zip. Taking R-devel as the root of my installed development R, I copied:

the whole local323.zip/include/libxml2/libxml folder to R-devel/include/libxml
the files in local323.zip/lib/i386 into R-devel/lib/i386/
the files in local323.zip/lib/x64 into R-devel/lib/x64/

Now I can see the file libxml2.a in R bin but get the same error while installing CytoML saying the file is not found.


jacobpwagner commented 4 years ago

Hey @Sithara85 . Sorry I was out for a bit. So, the XML package installation succeeded, but now it's just CytoML? Or is XML still failing?

jacobpwagner commented 4 years ago

Hey @Sithara85. I'm sorry I didn't jump to this sooner. I was just bouncing between things earlier. This libxml2 issue has been a recurring problem for Windows users, so we actually have pre-built binaries available. So, follow these instructions:

1) Go here: http://rglab.github.io/binaries/ and download the libxml2 zip file.
2) Unzip it anywhere, but keep track of the path. For example purposes, I'll say you just extract it at the root of the C drive so now you have C:\libxml2
3) Set the LIB_XML2 environment variable to that path.

That should work. Let me know if you still run into issues.
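The earlier failures referenced `-I//i386/include/libxml2` and `//i386/lib/libxml2.a`, which is what those paths would look like if the prefix in front of `/i386` were empty. That suggests, as an inference from the logs rather than anything confirmed in this thread, that the build looks under `$LIB_XML2/<arch>/`. A small sketch to check an extracted folder before retrying the install follows; `check_libxml2_root` is a hypothetical helper and the directory layout is an assumption.

```shell
# Hypothetical helper: check that an extracted libxml2 folder has the files the
# failing compile and link steps were looking for. The <root>/<arch>/... layout
# is an assumption inferred from the error paths, not a documented contract.
check_libxml2_root() {
  root="$1"
  arch="${2:-x64}"   # i386 or x64
  missing=0
  for path in \
    "$root/$arch/lib/libxml2.a" \
    "$root/$arch/include/libxml2/libxml/tree.h"
  do
    if [ ! -e "$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "ok"
  fi
  return "$missing"
}
```

From Rtools Bash this might be called as `check_libxml2_root /c/libxml2 i386`, where `/c/libxml2` stands in for wherever the zip was extracted.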

Sithara85 commented 4 years ago

Thank you Jake! The libxml2.a issue when installing CytoML is resolved now; setting the LIB_XML2 environment variable to that path helped. Now I will try gatingset_to_flowjo. Thank you for building the binaries.

Sithara85 commented 4 years ago

Thank you very much Jake for all the support provided!

Sithara85 commented 4 years ago

Hi Jake, Hope you are keeping well!

I was in the process of implementing the openCyto gating on a list of ~10,000 samples we have. I have to implement this on our supercomputer system, where the Docker image is invoked using a Singularity container. This is the first time I have had to use Singularity. I now have the Docker image available in Singularity as 'gs-to-flowjo_devel.sif'. Inside this container I can see the gs_to_flowjo function but not gatingset_to_flowjo. Its usage is shown below:

./gs-to-flowjo --help
usage: gs-to-flowjo [--version] [--cytolib-version] [--help] <src> <dest> [--showHidden=no]

src             GatingSet archive directory
dest            output flowjo wsp file path
showHidden      whether to export the hidden populations

Does this mean I have to import the openCyto gates wsp files to the src directory and run this gs_to_flowjo function to convert the openCyto wsp file to FlowJo xml file? I was hoping to implement the whole process as an Rscript. Could you advise what's the best way to invoke this function in singularity container?

Thank you, Sithara

mikejiang commented 4 years ago

See Usage section of https://hub.docker.com/r/rglab/gs-to-flowjo if you want to run it as a docker container command, which requires you to mount your host directory as volumes to container.

If you are going to run the gs-to-flowjo command inside the container, here is the help print

./gs-to-flowjo --help
usage: gs-to-flowjo [--version] [--cytolib-version] [--help] <src> <dest> [--showHidden=no]

src         GatingSet archive directory
dest            output flowjo wsp file path
showHidden      whether to export the hidden populations

it simply asks for the gatingset folder as the first argument (i.e. src) and output wsp file name as the second argument (i.e. dest), i.e., clipping out the docker volume mounting part from the instruction above

./gs-to-flowjo --src=<your_gs_dir> --dest=<your_output_wsp>

Note that it only does gatingset --> wsp, not wsp --> xml

jacobpwagner commented 4 years ago

You should also be able to run it using singularity run, using --bind or -B to mount (bind) the local paths instead of the -v option for Docker's volume mounting. The /gs and /out paths will need to be created, so your singularity.conf will need to have enable overlay = yes.

Or you can shell in to the container and then run the command. But either way you will need to bind mount the GatingSet and output paths as Mike mentioned.

jacobpwagner commented 4 years ago

@Sithara85 , it took some tinkering because the runscript created by singularity was getting the pathing a little off for the entrypoint. So, instead of singularity run, I switched it to singularity exec and manually re-directed it:

sudo singularity exec --bind gs_manual:/gs --bind out:/out docker://rglab/gs-to-flowjo:devel /gs-to-flowjo --src=/gs --dest=/out/converted.wsp

This example is launched from the local directory containing gs_manual and out just to make the paths simpler for demonstration. It's binding the local gs_manual to /gs in the container and the local out to /out in the container, then running the gs-to-flowjo command from within the container. The sudo is just a bit of a lazy workaround to get it to allow me to create the /gs and /out directories in the container. I'm guessing you don't have sudo privileges on your supercomputer system, but I'm sure there is a way to get this to work within the permissions options of singularity.conf. So I'd talk to your admin there. But the core of the command is there above. You'll just need to re-work the local paths and point to the appropriate image (for example you may want docker://rglab/gs-to-flowjo:2.0 for Bioconductor release 3.11).

jacobpwagner commented 4 years ago

Alternatively, if it is way too much of a pain to get the enable overlays permissions working, we could probably pre-create the /gs and /out directories in the image to avoid the need for their creation.

Sithara85 commented 4 years ago

Hi Jake and Mike,

We actually got it working yesterday in the container, using gs-to-flowjo via exec. All systems are down for monthly maintenance, so I couldn't check the results yet. But I am surprised to see Mike's comment that this function won't create a FlowJo XML file.

Once the system is up I will respond to you for help.

Thank you, Sithara


jacobpwagner commented 4 years ago

To be clear, the output wsp file is XML that defines the FlowJo workspace, which I'm guessing is what you want. You asked:

Does this mean I have to import the openCyto gates wsp files to the src directory and run this gs_to_flowjo function to convert the openCyto wsp file to FlowJo xml file?

This doesn't really make sense, as openCyto doesn't really need to play a part in this step at all. The input is a saved flowWorkspace::GatingSet (the directory that results from a save_gs call), which has all of the information about populations, gates, gating tree, etc. The output is a FlowJo workspace file (a .wsp whose file format is XML).

However, if you are talking about going straight from an openCyto::gatingTemplate to a FlowJo wsp without a GatingSet intermediate, that is not supported because an openCyto::gatingTemplate doesn't actually define gates, but rather automated approaches to determining gates.

But I guess I might not be fully understanding your question/request, so feel free to elaborate and I can help clarify.

jacobpwagner commented 4 years ago

@Sithara85 , the image has been updated to avoid the need for exec and any extra privileges to create those directories. So this command form should also now just work (after pulling the updated image). Of course, you could still use rglab/gs-to-flowjo:2.0 instead of rglab/gs-to-flowjo:devel depending on what version of the toolchain you're using.

singularity run -B <YOUR_GS_PATH>:/gs -B <YOUR_OUTPUT_PATH>:/out docker://rglab/gs-to-flowjo:devel --src=/gs --dest=/out/converted.wsp

With my example from before

singularity run -B gs_manual:/gs -B out:/out docker://rglab/gs-to-flowjo:devel --src=/gs --dest=/out/converted.wsp

The README has also been updated to reflect this.
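For a large batch of samples, the `singularity run` form above can be generated once per saved GatingSet directory. The sketch below only prints the commands so they can be reviewed or handed to a scheduler. It assumes one saved GatingSet per subdirectory of a parent folder and paths without spaces; the helper names and the `GS_TO_FLOWJO_IMAGE` variable are hypothetical.

```shell
# Hypothetical helper: print the singularity command for one GatingSet directory.
# $1 = saved GatingSet directory, $2 = output directory, $3 = output file name.
# Paths are printed unquoted, so this sketch assumes they contain no spaces.
gs_to_flowjo_cmd() {
  src_dir="$1"
  dest_dir="$2"
  wsp_name="$3"
  image="${GS_TO_FLOWJO_IMAGE:-docker://rglab/gs-to-flowjo:devel}"
  printf 'singularity run -B %s:/gs -B %s:/out %s --src=/gs --dest=/out/%s\n' \
    "$src_dir" "$dest_dir" "$image" "$wsp_name"
}

# Print one command per subdirectory of $1, writing <name>.wsp into $2.
# Nothing is executed here; pipe the output to `sh` (or a scheduler) to run it.
gs_to_flowjo_batch() {
  parent="$1"
  out_dir="$2"
  for gs_dir in "$parent"/*/; do
    [ -d "$gs_dir" ] || continue
    gs_dir="${gs_dir%/}"
    gs_to_flowjo_cmd "$gs_dir" "$out_dir" "$(basename "$gs_dir").wsp"
  done
}
```

Keeping generation separate from execution makes it easy to inspect a few lines before launching thousands of conversions.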

Sithara85 commented 4 years ago

Hi Jake,

Thank you so much for spending this much time to improve the usability. I think my main problem now is the one you explained in the previous comment. I have the loop below, which I was using with the gatingset_to_flowjo function; it created a .wsp FlowJo XML file. Does the gs-to-flowjo binary work differently? I tried to convert the openCyto gating template to a GatingSet because Singularity threw an error when I ran it without save_gs. But when I call save_gs it doesn't create a proper GatingSet, as you correctly pointed out.

I am really stuck. I had all this working in my R system, but with the Singularity container I have to add a step to export the gating set to the src directory so that gs-to-flowjo can create a wsp file.

Please see the code I am using below; kindly ignore the Singularity part for now. First I want to learn how to create a proper gs object after the new openCyto CD28 gate method has been added to the gating set. To give you a brief idea, 2-3 years ago a previous team implemented the openCyto framework to gate T cell populations and used Kmeans clustering for subsets of cytotoxic and helper T cells, as they used 3 or more markers to define each population. Now I am trying to regate the effector memory population by adding a mindensity gate on CD28. Kmeans outputs a boolean matrix, and I try to create a subframe from it.

fcsfiles <- list.files(fcsDir, pattern = ".fcs$", full.names = TRUE, recursive = TRUE)
for (f in (fcsfiles)) {
  print(f)
  flowFrame <- read.FCS(f)
  print(summary(flowFrame))
  flowSet <- read.flowSet(f)
  gs <- GatingSet(flowSet)
  matrix <- read.delim(list.files(KmeansDir, pattern = basename(f), full.names = TRUE, recursive = TRUE))
  Helper_EM = matrix$effector.memory & matrix$HELPER_T

  subFrame = gh_pop_get_data(gs)[Helper_EM, ]

  summary(subFrame)

  gs_EM3 <- GatingSet(as(subFrame, "flowSet"))
  template = gs_add_gating_method(
    gs_EM3,
    alias = "CD28Gate",
    pop = "+/-",
    parent = "root",
    dims = "CD28",
    gating_method = "mindensity",
    gating_args = "gate_range = c(1000,15000)"
  )
  print(gs_pop_get_count_fast(gs_EM3))
  fcs_file <- basename(f)

  outputRoot = paste0(outputDir, fcs_file)
  addedWSP = paste0(outputRoot, "_gsEM3")
  addedWSP = gsub(" ", "-", addedWSP)

  gs <- GatingSet(gs_EM3)

  save_gs(gs_EM3, path = addedWSP)

  print(paste0("adding and renaming nodes and writing to ", addedWSP))

  system2("singularity", paste0("exec ./gs-to-flowjo_devel.sif ./test.sh"))

  print(file.exists(addedWSP))
}

Sithara85 commented 4 years ago

Sorry for the delay to respond to your comment. Please take a look at your earliest convenience.

Thank you very much! Sithara


jacobpwagner commented 4 years ago

@Sithara85 , hopefully this sketch helps. I don't have access to your data. But I did my best to mimic your steps. For the Kmeans I just made it a random boolean vector to mimic your matrix$effector.memory & matrix$HELPER_T

library(flowCore)
library(flowWorkspace)
library(CytoML)
library(openCyto)

# I'm using these to mimic the fcs files from your fcsDir
fcsDir <- system.file("extdata", package = "flowWorkspaceData")
fcs_files <- list.files(fcsDir, pattern = "CytoTrol", full.names = TRUE)

cs <- load_cytoset_from_fcs(fcs_files)

# Filter each of the cytoframes in the cytoset using the kmeans results. You could add these as
# a "gate" by just passing the logical vectors to gs_pop_add, but I'm trying to directly adapt your
# workflow of pre-filtering before gating.
cs_filtered <- lapply(cs, function(cf){
  # These random boolean sample vectors I'm making are just mimicking your kmeans output
  # you would be getting from reading the appropriate matrices from KmeansDir:
  #
  # matrix <- read.delim(list.files(KmeansDir,pattern = basename(f),full.names = TRUE,recursive = TRUE))
  # Helper_EM = matrix$effector.memory & matrix$HELPER_T

  dummy_boolean_vec = rep(FALSE, nrow(cf))
  # Just making some arbitrary entries TRUE...
  dummy_boolean_vec[sample(nrow(cf), 5042)] <- TRUE

  # But here, once you have your kmeans boolean vector, just use it to filter the cytoframe
  realize_view(cf[dummy_boolean_vec,])
})
cs_filtered <- cytoset(cs_filtered)

# You can also save these out as filtered FCS files if you want them for use in FlowJo
temp_cs_path <- tempfile()
# You could also loop through calling write.FCS on each cytoframe
write.flowSet(cs_filtered, temp_cs_path)

# Create a GatingSet from the pre-filtered cytoset
gs <- GatingSet(cs_filtered)

# Add the mindensity gate to the GatingSet
# I have to change this a little bit for my demo GatingSet here
gs_add_gating_method(
  gs,
  alias = "CD4Gate",
  pop = "+/-",
  parent = "root",
  dims = "CD4",
  gating_method = "mindensity",
  gating_args = "gate_range = c(4000,10000)"
)

# Now you can save this GatingSet out wherever and then use the image to write the corresponding FlowJo wsp file
tmp_gs_path <- tempfile()
tmp_wsp_path <- tempfile()
save_gs(gs, tmp_gs_path)

# You could do this from the command line using Docker or Singularity. You just need to make --src be the location
# you save_gs to and --dest can be whatever filename in whatever location you want.
gatingset_to_flowjo(gs, tmp_wsp_path)

Sithara85 commented 4 years ago

Jake,

You are awesome! Thank you so much for troubleshooting all the steps. So basically my save_gs didn't export correctly because I assigned the mindensity result to a template. Did you get .nc, .dat and .rds files? I got a .gs file along with some other files, which did not form a proper gs archive directory.

I have a quick question before I try this again: we couldn't find gatingset_to_flowjo in the Singularity container. Are you saying gs-to-flowjo and gatingset_to_flowjo are the same function?

Thanks, Sithara

jacobpwagner commented 4 years ago

The storage format of GatingSet has changed a little recently. If you are working with the current branches, that GatingSet directory should have one .gs file and a .pb file and .h5 file for each sample. If you can successfully call load_gs on that directory to get another GatingSet, it is a proper GatingSet archive directory.
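A quick file-level check of that layout can be run before a directory is handed to the container. This is only a sketch based on the description above (one .gs file, plus a .pb and an .h5 file per sample); `check_gs_archive` is a hypothetical helper, and a successful load_gs call remains the real test.

```shell
# Hypothetical helper: count the files described above in a saved GatingSet
# directory. This only inspects file names; load_gs remains the real test.
check_gs_archive() {
  dir="$1"
  # tr strips the padding that some wc implementations add around the count.
  n_gs=$(find "$dir" -maxdepth 1 -type f -name '*.gs' | wc -l | tr -d ' ')
  n_pb=$(find "$dir" -maxdepth 1 -type f -name '*.pb' | wc -l | tr -d ' ')
  n_h5=$(find "$dir" -maxdepth 1 -type f -name '*.h5' | wc -l | tr -d ' ')
  echo "gs=$n_gs pb=$n_pb h5=$n_h5"
  # Succeed only with exactly one .gs file and matching, non-zero .pb/.h5 counts.
  [ "$n_gs" -eq 1 ] && [ "$n_pb" -ge 1 ] && [ "$n_pb" -eq "$n_h5" ]
}
```

A non-zero exit status here points at an incomplete or older-format directory, which would be worth re-saving with save_gs from the current packages.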

To your second question, CytoML::gatingset_to_flowjo just wraps up the command issued to Docker, and handles the volume mounting for you. So under the hood, they are both using gs-to-flowjo from that Docker image. CytoML::gatingset_to_flowjo is there for convenience to make it available in R by

1) Saving out the GatingSet to a directory if necessary (otherwise you can directly use the path where you already saved it): https://github.com/RGLab/CytoML/blob/9b820f37af805784d1bd7d224f63e415c2b1a26f/R/GatingSet2flowJo.R#L71-L80
2) Then yes, calling gs-to-flowjo from the Docker container: https://github.com/RGLab/CytoML/blob/9b820f37af805784d1bd7d224f63e415c2b1a26f/R/GatingSet2flowJo.R#L94-L102

So correct, you will not find gatingset_to_flowjo in the container, only gs-to-flowjo, which contains the core logic in a pre-compiled binary.

For further clarity, that final line I was adding in to be called from R:

gatingset_to_flowjo(gs, tmp_wsp_path)

is equivalent to either of these from a terminal (where <tmp_gs_path> and <tmp_wsp_path> are the input GatingSet and output wsp path respectively):

docker run -v <tmp_gs_path>:/gs -v <tmp_wsp_path>:/out rglab/gs-to-flowjo:devel --src=/gs --dest=/out
singularity run -B <tmp_gs_path>:/gs -B <tmp_wsp_path>:/out docker://rglab/gs-to-flowjo:devel --src=/gs --dest=/out

The gatingset_to_flowjo call will be completely equivalent to the terminal Docker call, but it sounds like you need to go with the terminal Singularity call due to security constraints.

Sithara85 commented 4 years ago

Great!! That explains the phenomenon I found with save_gs. Then I have to explore why singularity-src doesn’t recognize it as a proper gating set archive directory. I will try to load_gs and see.

Thank you so much for all the great details!!

Sithara

jacobpwagner commented 4 years ago

Yeah, if you could paste that code as well as any errors, I can try to figure out what's wrong. Also remember that your gs-to-flowjo image needs to be in sync with your cytolib version. If you're using the Bioconductor 3.11 branches of the packages, you should be using rglab/gs-to-flowjo:2.0. If you are using the absolute most recent version of the packages from GitHub, then you should use rglab/gs-to-flowjo:devel.

Sithara85 commented 4 years ago

Sure! I will load everything and rerun singularity.

Thank you!

Sithara85 commented 4 years ago

I am able to run load_gs on the folder for each sample in the src directory. I just ran the singularity command again and am getting the same error. I have three files in the src directory, as shown below.

image

LOAD_GS and GET DATA to confirm the .gs file exists----

gs <- load_gs("/panfs/roc/groups/15/thyagara/svivek/Singularity/src/2017-10-13_PANEL-1_LSR_EC_Group-two_EC_F1638929_022.fcs_gsEM3")
gh_pop_get_data(gs)
cytoframe object 'V1.fcs'
with 17950 cells and 19 observables:
     name        desc     range  minRange  maxRange
$P1  FSC-A               262143      0.00    262143
$P2  FSC-H               262143      0.00    262143
$P3  FSC-W               262143      0.00    262143
$P4  SSC-A               262143      0.00    262143
$P5  SSC-H               262143      0.00    262143
$P6  SSC-W               262143      0.00    262143
$P7  BB515-A     CD27    262143   -111.00    262143
$P8  PE-A        L/D     262143   -111.00    262143
$P9  PE-CF594-A  HLA-DR  262143    -52.80    262143
$P10 PE-Cy7-A    CD19    262143   -111.00    262143
$P11 BUV 395-A   CD8     262143   -111.00    262143
$P12 BUV 737-A   IgD     262143    -16.02    262143
$P13 APC-A       CD3     262143      0.00    262143
$P14 BV 421-A    CCR7    262143      0.00    262143
$P15 BV 510-A    CD28    262143      0.00    262143
$P16 BV 605-A    CD95    262143      0.00    262143
$P17 BV 711-A    CD45RA  262143   -111.00    262143
$P18 APC-Cy7-A   CD4     262143      0.00    262143
$P19 Time                262143      0.00    262143
280 keywords are stored in the 'description' slot
cytoframe has been subsetted and can be realized through 'realize_view()'.

SINGULARITY ERROR:

Singularity> ./gs-to-flowjo --src="~/Singularity/src/2017-10-13_PANEL-1_LSR_EC_Group-two_EC_F1638929_022.fcs_gsEM3" --dest="~/Singularity/dest"
terminate called after throwing an instance of 'std::domain_error'
what(): Not a valid GatingSet archiving folder! ~/Singularity/src/2017-10-13_PANEL-1_LSR_EC_Group-two_EC_F1638929_022.fcs_gsEM3
No .gs file found!
Aborted

It must be some simple mistake I am making in how I use the Singularity container.

Thanks, Sithara
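
A side note on the command above: the error message echoes the path back with a literal ~, which suggests the program received the tilde unexpanded. The shell does not expand ~ inside double quotes, whereas $HOME is expanded there. The sketch below only illustrates that shell behavior; show_arg is a hypothetical stand-in for any program, the path is shortened for illustration, and this is not necessarily the only cause of the error.

```shell
#!/bin/sh
# Sketch: show what a program receives for two ways of writing the path.
# show_arg is a hypothetical stand-in for any program, e.g. gs-to-flowjo.
show_arg() {
    printf '%s\n' "$1"
}

# The tilde stays literal inside double quotes.
quoted=$(show_arg --src="~/Singularity/src")
# $HOME is expanded inside double quotes.
expanded=$(show_arg --src="$HOME/Singularity/src")

echo "$quoted"
echo "$expanded"
```

Writing the path with $HOME, or as a full absolute path, avoids depending on tilde expansion at all.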

jacobpwagner commented 4 years ago

This is from within the container? Can I see your singularity shell statement or whatever you are using to run it?

Sithara85 commented 4 years ago

singularity shell ./gs-to-flowjo_devel.sif, is this what you asked for? This takes me into the container, and I have the gs-to-flowjo function there.

Singularity> ./gs-to-flowjo --cytolib-version
2.1.15

jacobpwagner commented 4 years ago

But with the mount options (-B) as well. It looks like the paths within the container might be a little off so it's just not finding the GatingSet archive appropriately.

Sithara85 commented 4 years ago

I tried that as well; here I used the mount option only for the src directory. Our admin said that since Singularity is attached directly to my home directory, I only need mount options when the input comes from a location other than my home.

Singularity> ./gs-to-flowjo -B /panfs/roc/groups/15/thyagara/svivek/Singularity/src/2017-10-13_PANEL-1_LSR_EC_Group-two_EC_F1638929_022.fcs_gsEM3 --src=/panfs/roc/groups/15/thyagara/svivek/Singularity/src/2017-10-13_PANEL-1_LSR_EC_Group-two_EC_F1638929_022.fcs_gsEM3 --dest=/panfs/roc/groups/15/thyagara/svivek/Singularity/dest
terminate called after throwing an instance of 'std::domain_error'
what(): Not a valid GatingSet archiving folder! /panfs/roc/groups/15/thyagara/svivek/Singularity/src/2017-10-13_PANEL-1_LSR_EC_Group-two_EC_F1638929_022.fcs_gsEM3
No .gs file found!
Aborted

jacobpwagner commented 4 years ago

The -B option isn't an argument to gs-to-flowjo as you used it there. It's an argument to singularity shell and would need to be used in starting the singularity container so that path is visible within the container. From within the container, can you see that input directory and its appropriate contents?

Singularity> ls /panfs/roc/groups/15/thyagara/svivek/Singularity/src/2017-10-13_PANEL-1_LSR_EC_Group-two_EC_F1638929_022.fcs_gsEM3/
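
To make the placement of -B concrete, here is a minimal sketch of starting the shell with the bind options, mirroring the singularity run example earlier in the thread. The two host paths are hypothetical placeholders, and the commands are printed rather than executed so the sketch runs without Singularity installed.

```shell
#!/bin/sh
# Sketch: -B is passed to singularity itself, before the image name, so the
# host paths become visible inside the container.
gs_path=/path/to/gs        # hypothetical host path to the saved GatingSet directory
wsp_path=/path/to/out.wsp  # hypothetical host path for the output wsp file

# Command to start the container shell with both paths bound.
shell_cmd="singularity shell -B $gs_path:/gs -B $wsp_path:/out ./gs-to-flowjo_devel.sif"
echo "$shell_cmd"

# Inside that shell, gs-to-flowjo is given the container-side paths only.
inner_cmd="./gs-to-flowjo --src=/gs --dest=/out"
echo "$inner_cmd"
```

With the binds in place, running ls /gs inside the container should list the .gs, .pb and .h5 files of the archive.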

Sithara85 commented 4 years ago

Hi Jake,

I tried singularity run from the terminal as you showed in the example, but I am getting the same error.

svivek@ln0004 [/panfs/roc/scratch/svivek/HRS_EM/Helper_EM/flow] % singularity run -B /panfs/roc/scratch/svivek/HRS_EM/Helper_EM/flow/src/:/gs -B /panfs/roc/scratch/svivek/HRS_EM/Helper_EM/flow/src/:/out docker://rglab/gs-to-flowjo:devel --src=/gs --dest=/out
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
Getting image source signatures
Copying blob df20fa9351a1 done
Copying blob df33d3a3b5af done
Copying blob f652fbb80199 done
Copying blob 0bc08d6984f5 done
Copying blob f1182f4df6de done
Copying config f4de1d5a9e done
Writing manifest to image destination
Storing signatures
2020/08/06 12:44:01 info unpack layer: sha256:df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c
2020/08/06 12:44:02 info unpack layer: sha256:df33d3a3b5af66a9fd10f19b1642ca12052278dc0a3dc4b4e0a1ea5ec080c2f9
2020/08/06 12:44:02 info unpack layer: sha256:f652fbb80199552676439d601ff4844a644662b29ad2f340a0ee79103235c57b
2020/08/06 12:44:02 info unpack layer: sha256:0bc08d6984f5fe3bc3853fe8b1031b7c5f9a5a2f3f3034914af5e2623ec3fd3c
2020/08/06 12:44:02 info unpack layer: sha256:f1182f4df6de2668f9d9b4020ae48ac141f96547280573a764652680d34727fd
INFO: Creating SIF file...
terminate called after throwing an instance of 'std::domain_error'
what(): Not a valid GatingSet archiving folder! /gs
File not recognized: file:///gs/2017-10-13_PANEL-1_LSR_EC_Group-one_RR_F1631478_024.fcs_gsEM3
Aborted

Sithara85 commented 4 years ago

Let me try what you said!