DillonHammill / CytoExploreR

Interactive Cytometry Data Analysis

cyto_load behavior - Windows 10 #48

Closed rwbaer closed 4 years ago

rwbaer commented 4 years ago

Briefly describe what you hope to achieve: I am trying to load a series of .fcs files located in a network folder that may or may not contain only .fcs files. The plan would be to use cyto_save to put them in a subfolder of the current RStudio project.

Using the first method (select), only a single file from my list seems to load. Using the second method (pattern), I get an error purportedly related to path normalization. I am specifying paths with forward slashes, which avoids having to double-escape backslashes in Windows 10 paths. By the time the error is raised, the code has turned my path back into Windows backslashes.

Outline the steps taken to attempt to reach this goal (paste code below): The error is shown in the code fragment below (note that I also tried a version where d.comp has no terminating slash; both forms work with R's file.path()):

> d.comp = "S:/1-Thesis/7-Thesis Datasets and Code/Flow Cytometry/Analysis - Legendplex Assays/2020-02-05_Legendplex03_CCL4MVC/calibration/"
> f = list.files(path = d.comp, pattern = ".fcs")
> f
[1] "PE_Beads.fcs"            "PE_Beads_Legendplex.fcs" "PerCP_Spill.fcs"        
[4] "raw beads.fcs"          
> Compensation = cyto_load(path = d.comp, select = f)
> Compensation
A cytoset with 1 samples.

  column names:
    TIME_MSW, TIME_LSW, FSC-HEIGHT, FSC-AREA, FSC-WIDTH, SSC-HEIGHT, SSC-AREA, SSC-WIDTH, FL1-HEIGHT, FL1-AREA, FL1-WIDTH, FL2-HEIGHT, FL2-AREA, FL2-WIDTH, FL3-HEIGHT, FL3-AREA, FL3-WIDTH, FL4-HEIGHT, FL4-AREA, FL4-WIDTH, SORT

> Compensation2 = cyto_load(path = d.comp, pattern = ".fcs") 
Error in fcs_to_cytoset(sapply(files, normalizePath), list(which.lines = which.lines,  : 
  can't open the file: S:\1-Thesis\7-Thesis Datasets and Code\Flow Cytometry\Analysis - Legendplex Assays\2020-02-05_Legendplex03_CCL4MVC\calibration\Protocol
Please check if the path is normalized to be recognized by c++!


rwbaer commented 4 years ago

This may be a bug based on:

> Compensationn3 = load_cytoset_from_fcs(path = d.comp, pattern = ".fcs")
> Compensationn3
A cytoset with 4 samples.

  column names:
    TIME_MSW, TIME_LSW, FSC-HEIGHT, FSC-AREA, FSC-WIDTH, SSC-HEIGHT, SSC-AREA, SSC-WIDTH, FL1-HEIGHT, FL1-AREA, FL1-WIDTH, FL2-HEIGHT, FL2-AREA, FL2-WIDTH, FL3-HEIGHT, FL3-AREA, FL3-WIDTH, FL4-HEIGHT, FL4-AREA, FL4-WIDTH, SORT

> Compensationn3[[3]]
flowFrame object 'PerCP_Spill.fcs'
with 7133 cells and 21 observables:
           name             desc      range minRange   maxRange
$P1    TIME_MSW            TIME- 4294967296        0 4294967296
$P2    TIME_LSW            TIME- 4294967296        0 4294967296
$P3  FSC-HEIGHT       FSC-HEIGHT    1048575        0    1048575
$P4    FSC-AREA         FSC-AREA    1048575        0    1048575
$P5   FSC-WIDTH        FSC-WIDTH    1048575        0    1048575
$P6  SSC-HEIGHT       SSC-HEIGHT    1048575        0    1048575
$P7    SSC-AREA         SSC-AREA    1048575        0    1048575
$P8   SSC-WIDTH        SSC-WIDTH    1048575        0    1048575
$P9  FL1-HEIGHT       FL1-HEIGHT    1048575        0    1048575
$P10   FL1-AREA         FL1-AREA    1048575        0    1048575
$P11  FL1-WIDTH        FL1-WIDTH    1048575        0    1048575
$P12 FL2-HEIGHT  FL2-FITC-HEIGHT    1048575        0    1048575
$P13   FL2-AREA    FL2-FITC-AREA    1048575        0    1048575
$P14  FL2-WIDTH   FL2-FITC-WIDTH    1048575        0    1048575
$P15 FL3-HEIGHT    FL3-PE-HEIGHT    1048575        0    1048575
$P16   FL3-AREA      FL3-PE-AREA    1048575        0    1048575
$P17  FL3-WIDTH     FL3-PE-WIDTH    1048575        0    1048575
$P18 FL4-HEIGHT FL4-PerCP-HEIGHT    1048575        0    1048575
$P19   FL4-AREA   FL4-PerCP-AREA    1048575        0    1048575
$P20  FL4-WIDTH  FL4-PerCP-WIDTH    1048575        0    1048575
$P21       SORT             SORT 4294967296        0 4294967296
195 keywords are stored in the 'description' slot

rwbaer commented 4 years ago

Since you may not be a Windows user and may not know this: Windows works just fine with paths that use the forward slash. When you are writing cross-platform code, this greatly simplifies things.

If you choose the backslash character in R, you need to escape it. Why my paths revert to the Windows backslash standard here, I don't understand. Even worse, the single backslash will fail (at least in R; I don't know about C++).

I am surprised the error message is not escaped and written as: S:\\1-Thesis\\7-Thesis Datasets and Code\\Flow Cytometry\\Analysis - Legendplex Assays\\2020-02-05_Legendplex03_CCL4MVC\\calibration\\Protocol

if it is using backslashes at all. Of course, to make this message look right I had to double escape and maybe that figures in. All in all if you can avoid using the backslash life gets easier.
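One possible explanation for the unescaped message, sketched under the assumption that the error text is written out like ordinary message output: in R source, `\\` denotes a single backslash character in the string, and only `print()` re-escapes it for display. The path below is made up for illustration and is not one from this issue.

```r
# Hypothetical Windows-style path, written with escaped backslashes in source.
win_path <- "C:\\Users\\demo"

# Each "\\" in the source is ONE character in the string:
# C : \ U s e r s \ d e m o
n_chars <- nchar(win_path)

# print() shows the escaped (source-like) form of the string ...
printed <- capture.output(print(win_path))

# ... while cat() writes the raw characters, with single backslashes.
raw_out <- capture.output(cat(win_path, "\n", sep = ""))

n_chars
printed
raw_out
```

So a path shown with single backslashes in an error message can still be a perfectly valid string internally; the doubling is only needed when typing the path into R source.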

I'm not sure normalizePath() is necessary even for C++, but if it is, try using winslash = "/"
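As a minimal sketch of the `winslash = "/"` suggestion: base R's `normalizePath()` accepts a `winslash` argument that controls the separator used on Windows (it has no effect on other platforms). The example uses `tempdir()` only because it is guaranteed to exist; whether the C++ reader behind cyto_load accepts forward slashes is not verified here.

```r
# tempdir() always exists, so mustWork = TRUE is safe in this example.
p <- normalizePath(tempdir(), winslash = "/", mustWork = TRUE)

# On Windows this yields forward slashes instead of the default backslashes;
# on other platforms the path normally contains no backslashes anyway.
has_backslash <- grepl("\\", p, fixed = TRUE)

p
has_backslash
```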

DillonHammill commented 4 years ago

I will look into this when I get time. I wrote these functions with the expectation that the files would be located in folders in the current working directory and not some foreign location.

It will most likely be fixed by normalizePath().

DillonHammill commented 4 years ago

@rwbaer this should now be fixed. Please pull down the latest version of CytoExploreR and try again:

devtools::install_github("DillonHammill/CytoExploreR")

rwbaer commented 4 years ago

Unfortunately, updating did not seem to improve things. I talked a lot about path construction above, which may have distracted you from the first way I tried to use the cyto_load() function:

> # Code below is for Issue # 48 report
> d.comp = "S:/1-Thesis/7-Thesis Datasets and Code/Flow Cytometry/Analysis - Legendplex Assays/2020-02-05_Legendplex03_CCL4MVC/calibration/"
> f = list.files(path = d.comp, pattern = ".fcs")
> f
[1] "PE_Beads.fcs"            "PE_Beads_Legendplex.fcs" "PerCP_Spill.fcs"         "raw beads.fcs"          
> Compensation = cyto_load(path = d.comp, select = f)
> Compensation
A cytoset with 1 samples.

  column names:
    TIME_MSW, TIME_LSW, FSC-HEIGHT, FSC-AREA, FSC-WIDTH, SSC-HEIGHT, SSC-AREA, SSC-WIDTH, FL1-HEIGHT, FL1-AREA, FL1-WIDTH, FL2-HEIGHT, FL2-AREA, FL2-WIDTH, FL3-HEIGHT, FL3-AREA, FL3-WIDTH, FL4-HEIGHT, FL4-AREA, FL4-WIDTH, SORT

Note that there are 4 files in the directory, but cyto_load() only loads the first one.

Repeating the second way I tried to use it, the result seems just the same.
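A small check like the following could make the mismatch between requested and loaded files explicit. It is only a sketch: `missing_samples()` is a hypothetical helper, and `loaded` is a stand-in for the sample names of the loaded object (for example, what `sampleNames()` might return), assumed here to contain just the first file.

```r
# Hypothetical helper: which requested files are absent from the loaded samples?
missing_samples <- function(expected_files, loaded_names) {
  setdiff(basename(expected_files), loaded_names)
}

# File names as listed by list.files() in the report above.
expected <- c("PE_Beads.fcs", "PE_Beads_Legendplex.fcs",
              "PerCP_Spill.fcs", "raw beads.fcs")

# Stand-in for the names of the samples that actually loaded (assumption).
loaded <- "PE_Beads.fcs"

not_loaded <- missing_samples(expected, loaded)
not_loaded
```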

rwbaer commented 4 years ago

I have looked more carefully at the output of the second method:

> d.comp = "S:/1-Thesis/7-Thesis Datasets and Code/Flow Cytometry/Analysis - Legendplex Assays/2020-02-05_Legendplex03_CCL4MVC/calibration/"
> Compensation2 = cyto_load(path = d.comp, pattern = ".fcs") 
Error in fcs_to_cytoset(sapply(files, normalizePath), list(which.lines = which.lines,  : 
  can't open the file: S:\1-Thesis\7-Thesis Datasets and Code\Flow Cytometry\Analysis - Legendplex Assays\2020-02-05_Legendplex03_CCL4MVC\calibration\Protocol
Please check if the path is normalized to be recognized by c++!
> Compensation2
Error: object 'Compensation2' not found

The hint about path normalization is a red herring. The problem is that cyto_load is trying to open a subdirectory of the given path as if it were one of the files matching the pattern ".fcs". The existence of the "./Protocol" subfolder seems to trigger the error even though it does not match the pattern specified.
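To illustrate the kind of filtering that avoids this, here is a base-R sketch (not CytoExploreR's actual code) that builds a throwaway folder with made-up names, then lists only regular files ending in ".fcs". Two details matter: a non-recursive `list.files()` also returns subdirectories, and the pattern is a regular expression, so an unanchored ".fcs" matches any character followed by "fcs" anywhere in the name.

```r
# Throwaway demo folder with hypothetical contents.
demo_dir <- file.path(tempdir(), "cyto_load_demo")
unlink(demo_dir, recursive = TRUE)
dir.create(file.path(demo_dir, "New folder"), recursive = TRUE)
invisible(file.create(file.path(
  demo_dir, c("a.fcs", "b.fcs", "ReadMe.txt", "my_fcs_notes.txt")
)))

# Unanchored regex: "." matches any character, so "my_fcs_notes.txt" slips in.
loose <- list.files(demo_dir, pattern = ".fcs")

# Anchored regex with an escaped dot: only names that end in ".fcs".
strict <- list.files(demo_dir, pattern = "\\.fcs$",
                     full.names = TRUE, ignore.case = TRUE)

# Drop anything that is a directory (e.g. a folder that happens to be
# named "something.fcs").
strict <- strict[!dir.exists(strict)]

loose
basename(strict)
```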

[Image: SpillFolder]

DillonHammill commented 4 years ago

@rwbaer, the select and exclude arguments were originally written to accept file extensions only. For example, this works:

gs <- cyto_setup("Samples",
                 select = "fcs")

I have updated the handling of the select and exclude arguments to appropriately handle file names as well. Try the latest CytoExploreR and let me know how you go.
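For illustration only, accepting both file extensions and file names could look something like the sketch below. `select_files()` is a hypothetical helper written for this example; it is not the CytoExploreR implementation, and the file paths are made up.

```r
# Hypothetical helper: keep files whose extension OR exact file name
# appears in `select`.
select_files <- function(files, select) {
  # Treat entries such as "fcs" or ".fcs" as extensions (case-insensitive).
  wanted_ext <- tolower(sub("^\\.", "", select))
  ext_match <- tolower(tools::file_ext(files)) %in% wanted_ext
  # Treat entries such as "a.fcs" as exact file names.
  name_match <- basename(files) %in% select
  files[ext_match | name_match]
}

# Made-up directory listing, including a subfolder and a text file.
files <- c("dir/a.fcs", "dir/b.fcs", "dir/ReadMe.txt", "dir/New folder")

by_ext  <- select_files(files, "fcs")    # extension form
by_name <- select_files(files, "a.fcs")  # file-name form

by_ext
by_name
```

Because entries without a matching extension or name are simply dropped, the subfolder and the text file never reach the FCS reader in this sketch.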

devtools::install_github("DillonHammill/CytoExploreR")

rwbaer commented 4 years ago

Hmmm... I had assumed that select and exclude matched file names against a regular expression, keeping or excluding the matches. Anyway, here is the behavior with your syntax if the folder in question contains a subfolder:

> # Load required packages
> library(CytoExploreR)
> rm(Compensation)
Warning message:
In rm(Compensation) : object 'Compensation' not found
> d.comp = "C:/Users/rbaer/Documents/R_Projects/LP3/Compensation-Samples/"
> f.all = list.files(path = d.comp)
> f.all
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs" "New folder"                         "ReadMe.txt"                        
> f = list.files(path = d.comp, pattern = ".fcs")
> f
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs"
> Compensation = cyto_load(path = d.comp, select = f)
Error in fcs_to_cytoset(sapply(files, normalizePath), list(which.lines = which.lines,  : 
  can't open the file: C:\Users\rbaer\Documents\R_Projects\LP3\Compensation-Samples\New folder
Please check if the path is normalized to be recognized by c++!
> Compensation
A flowSet with 7 experiments.

  column names:
  FSC-A FSC-H FSC-W SSC-A SSC-H SSC-W Alexa Fluor 488-A PE-A PE-Texas Red-A 7-AAD-A PE-Cy7-A Alexa Fluor 405-A Alexa Fluor 430-A Qdot 605-A Alexa Fluor 647-A Alexa Fluor 700-A APC-Cy7-A Time

Restarting R session...

> # Code below is for Issue # 48 report
> # Make sure you are using the latest code
> devtools::install_github("DillonHammill/CytoExploreR")
Skipping install of 'CytoExploreR' from a github remote, the SHA1 (29549259) has not changed since last install.
  Use `force = TRUE` to force installation
> # Load required packages
> library(CytoExploreR)
Loading required package: flowCore
Loading required package: flowWorkspace
As part of improvements to flowWorkspace, some behavior of
GatingSet objects has changed. For details, please read the section
titled "The cytoframe and cytoset classes" in the package vignette:

  vignette("flowWorkspace-Introduction", "flowWorkspace")
Loading required package: openCyto
> rm(Compensation)
Warning message:
In rm(Compensation) : object 'Compensation' not found
> d.comp = "C:/Users/rbaer/Documents/R_Projects/LP3/Compensation-Samples/"
> f.all = list.files(path = d.comp)
> f.all
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs" "New folder"                         "ReadMe.txt"                        
> f = list.files(path = d.comp, pattern = ".fcs")
> f
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs"
> Compensation = cyto_load(path = d.comp, select = f)
Error in fcs_to_cytoset(sapply(files, normalizePath), list(which.lines = which.lines,  : 
  can't open the file: C:\Users\rbaer\Documents\R_Projects\LP3\Compensation-Samples\New folder
Please check if the path is normalized to be recognized by c++!
> Compensation
Error: object 'Compensation' not found
> comp = cyto_load(path = d.comp, select = "fcs")
Error in fcs_to_cytoset(sapply(files, normalizePath), list(which.lines = which.lines,  : 
  can't open the file: C:\Users\rbaer\Documents\R_Projects\LP3\Compensation-Samples\New folder
Please check if the path is normalized to be recognized by c++!
> comp
Error: object 'comp' not found

If the subfolder is removed, but a Readme.txt file is left behind:

> devtools::install_github("DillonHammill/CytoExploreR")
Skipping install of 'CytoExploreR' from a github remote, the SHA1 (29549259) has not changed since last install.
  Use `force = TRUE` to force installation
> 
> # Load required packages
> library(CytoExploreR)
> 
> rm(Compensation)
Warning message:
In rm(Compensation) : object 'Compensation' not found
> d.comp = "C:/Users/rbaer/Documents/R_Projects/LP3/Compensation-Samples/"
> f.all = list.files(path = d.comp)
> f.all
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs" "ReadMe.txt"                        
> f = list.files(path = d.comp, pattern = ".fcs")
> f
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs"
> Compensation = cyto_load(path = d.comp, select = f)
Error in fcs_to_cytoset(sapply(files, normalizePath), list(which.lines = which.lines,  : 
  This does not seem to be a valid FCS2.0, FCS3.0 or FCS3.1 file
> Compensation
Error: object 'Compensation' not found
> 
> comp = cyto_load(path = d.comp, select = "fcs")
Error in fcs_to_cytoset(sapply(files, normalizePath), list(which.lines = which.lines,  : 
  This does not seem to be a valid FCS2.0, FCS3.0 or FCS3.1 file
> comp
Error: object 'comp' not found

If both subfolder and readme.txt are removed, it works as expected:


Restarting R session...

> # Code below is for Issue # 48 report
> # Make sure you are using the latest code
> devtools::install_github("DillonHammill/CytoExploreR")
Skipping install of 'CytoExploreR' from a github remote, the SHA1 (29549259) has not changed since last install.
  Use `force = TRUE` to force installation
> 
> # Load required packages
> library(CytoExploreR)
Loading required package: flowCore
Loading required package: flowWorkspace
As part of improvements to flowWorkspace, some behavior of
GatingSet objects has changed. For details, please read the section
titled "The cytoframe and cytoset classes" in the package vignette:

  vignette("flowWorkspace-Introduction", "flowWorkspace")
Loading required package: openCyto
> 
> rm(Compensation)
Warning message:
In rm(Compensation) : object 'Compensation' not found
> d.comp = "C:/Users/rbaer/Documents/R_Projects/LP3/Compensation-Samples/"
> f.all = list.files(path = d.comp)
> f.all
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs"
> f = list.files(path = d.comp, pattern = ".fcs")
> f
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs"
> Compensation = cyto_load(path = d.comp, select = f)
> Compensation
A cytoset with 2 samples.

  column names:
    TIME_MSW, TIME_LSW, FSC-HEIGHT, FSC-AREA, FSC-WIDTH, SSC-HEIGHT, SSC-AREA, SSC-WIDTH, FL1-HEIGHT, FL1-AREA, FL1-WIDTH, FL2-HEIGHT, FL2-AREA, FL2-WIDTH, FL3-HEIGHT, FL3-AREA, FL3-WIDTH, FL4-HEIGHT, FL4-AREA, FL4-WIDTH, SORT

My understanding was that cyto_load was a wrapper for load_cytoset_from_fcs().

The latter works as expected with subfolders and other files present:

> library(CytoExploreR)
> 
> rm(Compensation)
Warning message:
In rm(Compensation) : object 'Compensation' not found
> d.comp = "C:/Users/rbaer/Documents/R_Projects/LP3/Compensation-Samples/"
> f.all = list.files(path = d.comp)
> f.all
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs" "New folder"                         "Readme.txt"                        
> f = list.files(path = d.comp, pattern = ".fcs")
> f
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs"
> 
> Compensation3 = load_cytoset_from_fcs(path = d.comp, pattern = ".fcs")
> Compensation3
A cytoset with 2 samples.

  column names:
    TIME_MSW, TIME_LSW, FSC-HEIGHT, FSC-AREA, FSC-WIDTH, SSC-HEIGHT, SSC-AREA, SSC-WIDTH, FL1-HEIGHT, FL1-AREA, FL1-WIDTH, FL2-HEIGHT, FL2-AREA, FL2-WIDTH, FL3-HEIGHT, FL3-AREA, FL3-WIDTH, FL4-HEIGHT, FL4-AREA, FL4-WIDTH, SORT

DillonHammill commented 4 years ago

Sorry, there was a typo that I missed. It should work now, hopefully.

rwbaer commented 4 years ago

Looking like we have a winner ...

> # Load required packages
> library(CytoExploreR)
> 
> rm(Compensation)
Warning message:
In rm(Compensation) : object 'Compensation' not found
> d.comp = "C:/Users/rbaer/Documents/R_Projects/LP3/Compensation-Samples/"
> f.all = list.files(path = d.comp)
> f.all
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs" "New folder"                         "PE_Beads_Legendplex.fcs"           
[5] "raw beads.fcs"                      "Readme.txt"                        
> f = list.files(path = d.comp, pattern = "Spill")
> f
[1] "20200205_FL3_PE_Beads-Spill.fcs"    "20200205_FL4_PerCP_Beads-Spill.fcs"
> Compensation = cyto_load(path = d.comp, select = f)
> Compensation
A cytoset with 2 samples.

  column names:
    TIME_MSW, TIME_LSW, FSC-HEIGHT, FSC-AREA, FSC-WIDTH, SSC-HEIGHT, SSC-AREA, SSC-WIDTH, FL1-HEIGHT, FL1-AREA, FL1-WIDTH, FL2-HEIGHT, FL2-AREA, FL2-WIDTH, FL3-HEIGHT, FL3-AREA, FL3-WIDTH, FL4-HEIGHT, FL4-AREA, FL4-WIDTH, SORT

> 
> comp = cyto_load(path = d.comp, select = "fcs")
> comp
A cytoset with 4 samples.

  column names:
    TIME_MSW, TIME_LSW, FSC-HEIGHT, FSC-AREA, FSC-WIDTH, SSC-HEIGHT, SSC-AREA, SSC-WIDTH, FL1-HEIGHT, FL1-AREA, FL1-WIDTH, FL2-HEIGHT, FL2-AREA, FL2-WIDTH, FL3-HEIGHT, FL3-AREA, FL3-WIDTH, FL4-HEIGHT, FL4-AREA, FL4-WIDTH, SORT

DillonHammill commented 4 years ago

Excellent! Glad to see that it works now!