JinmiaoChenLab / cytofkit

cytofkit: an integrated flow/mass cytometry data analysis pipeline
http://jinmiaochenlab.github.io/cytofkit/

Error-tSNE continues to be performed even when de-selected in GUI #59

Open mleipold opened 5 years ago

mleipold commented 5 years ago

Hi,

I'm trying to do a huge analysis (FlowSOM on >400 files without downsampling), and I had issues with tSNE crashing the R session after days of trying.

So, I restarted my analysis with the tSNE button unchecked. See screenshot (011919-cytofkit run-tsne issue).

However, when I hit Submit to start the job, RStudio still gave me this:

So, tSNE is still being performed as a dimensionality reduction.

I didn't think that Rphenograph or FlowSOM require tSNE for clustering (unlike, say, ACCENSE).

Is this tSNE process required in some fashion?

Mike

SamGG commented 5 years ago

Hi Mike,

The code forces the dimension reduction method to be tsne. No information from the GUI seems to be used to fill this parameter.
https://github.com/JinmiaoChenLab/cytofkit/blob/c4f93e5d849cf670c3825e5abacd850822c8eca7/R/cytofkit_GUI.R#L531-L533

The cytofkit function allows 3 dimension reduction methods (L136). This dimension reduction is independent of the visualisation (L138).
https://github.com/JinmiaoChenLab/cytofkit/blob/c4f93e5d849cf670c3825e5abacd850822c8eca7/R/cytofkit.R#L136-L138

The dimension reduction function itself allows more methods, but I stuck to the main 3.
https://github.com/JinmiaoChenLab/cytofkit/blob/c4f93e5d849cf670c3825e5abacd850822c8eca7/R/cytof_dimensionReduction.R#L34
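Since the GUI hardcodes the parameter, one possible workaround is to call cytofkit() directly and pass the dimension reduction method yourself. The following is only a sketch: the argument names fcsFiles, markers, projectName, mergeMethod, dimReductionMethod and clusterMethods appear in the traceback later in this thread, while visualizationMethods, the value "all" for mergeMethod, and both paths are assumptions that should be checked against ?cytofkit before use.

```r
# Assemble the arguments for a direct cytofkit() call that avoids t-SNE.
# Both paths are hypothetical placeholders.
build_args <- function(fcs_dir, markers_file) {
  list(
    fcsFiles             = fcs_dir,
    markers              = markers_file,
    projectName          = "no_tsne_run",  # hypothetical project name
    mergeMethod          = "all",          # assumption: "all" keeps every cell
    dimReductionMethod   = "pca",          # instead of the GUI's forced "tsne"
    clusterMethods       = "FlowSOM",
    visualizationMethods = "pca"           # assumption: argument name per ?cytofkit
  )
}

args <- build_args("path/to/fcs_folder", "path/to/markers.txt")
str(args)

# Run only when cytofkit is installed and the placeholder folder really exists.
if (requireNamespace("cytofkit", quietly = TRUE) && dir.exists(args$fcsFiles)) {
  do.call(cytofkit::cytofkit, args)
}
```

Building the list first makes it easy to inspect what will be passed before committing to a long run.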

So, because you do so much on cytoforum, here is a modified GUI for you (PS: it's very late and I didn't check whether it makes sense). Run:

library(flowCore)
library(cytofkit)
# open and source cytofkit_GUI_mod.R
cytofkit_GUI()

A line for dimension reduction has been added and initialized with tSNE; change it to PCA and deselect tsne. A very short trial run and the modified code follow. Download the text file containing the code from the link at the end of this post and rename it to cytofkit_GUI_mod.R. I hope it works for you.

Notice the Dim. Red. parameter.

[Screenshot: gui]

While running, PCA is reported as the dimension reduction method.

[Screenshot: run]

The results in the Shiny app show the PCA.

[Screenshot: results]

The code is in the attached file: cytofkit_GUI_mod.R.txt

mleipold commented 5 years ago

Hi Samuel,

Thanks for the response.

I downloaded the cytofkit_GUI_mod.R.txt, changed it to cytofkit_GUI_mod.R (no .txt), placed it in the cytofkit R folder, and ran the GUI. I did not get the "Dim. red. method(s)" to show.

I thought maybe the original cytofkit_GUI.R was interfering, so I removed it from the folder and reran the GUI. Still no "Dim.red".

I noticed that removing the ".txt" in a Mac environment did not give the little "R" type icon. So, I copied the text into an RStudio pane and saved it as cytofkit_GUI.R. This time I got the "R" type icon, so I reran the GUI. Still no "Dim.red" in the GUI.

Is there something I'm leaving out?

BTW, I'm in Mac High Sierra 10.13.4, RStudio 1.1.442, R 3.4.4

Mike

SamGG commented 5 years ago

Hi Mike, that sounds overcomplicated; I didn't explain clearly. I guess you are using RStudio. Load the cytofkit library as usual. Before running cytofkit_GUI(), open the cytofkit_GUI_mod.R file in RStudio, then click Source (Ctrl+Shift+S). This loads the modified function. Then run the GUI. Let me know.
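Why sourcing works without touching the package folder: source() evaluates the file in the workspace, and a function defined there is found before a package function of the same name. A minimal base-R illustration follows, using sd() as a stand-in for cytofkit_GUI() (the temporary file and the replacement function are made up for the demo):

```r
# Write a tiny script that redefines a function also exported by a package.
script <- tempfile(fileext = ".R")
writeLines('sd <- function(x, ...) "modified sd() from sourced file"', script)

# Same effect as clicking Source in RStudio: the definition lands in the workspace.
source(script)

from_workspace <- sd(c(1, 2, 3))         # the sourced version is found first
from_package   <- stats::sd(c(1, 2, 3))  # the pkg:: prefix still reaches the original

print(from_workspace)
print(from_package)

# Clean up so the real sd() is visible again.
rm(sd, envir = globalenv())
unlink(script)
```

In the same way, after sourcing cytofkit_GUI_mod.R a plain cytofkit_GUI() call should pick up the modified function, while cytofkit::cytofkit_GUI() still runs the packaged one.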

mleipold commented 5 years ago

Hi Samuel,

I downloaded the "cytofkit_GUI_mod.R.txt" again just to have a clean copy. I removed the ".txt", then loaded it into RStudio as you described.


library("cytofkit", lib.loc="/Library/Frameworks/R.framework/Versions/3.4/Resources/library")
Loading required package: ggplot2
Loading required package: plyr
source('~/Downloads/JinmiaoChenLab-cytofkit-c4f93e5/R/cytofkit_GUI_mod.R')
library(flowCore)

Attaching package: ‘flowCore’

The following object is masked from ‘package:base’:

sort

cytofkit_GUI()
Error in tclVar(cur_dir) : could not find function "tclVar"
source('~/Desktop/cytofkit_GUI_mod.R')
cytofkit_GUI()
Error in tclVar(cur_dir) : could not find function "tclVar"

I haven't seen this "tclVar" error before.

SamGG commented 5 years ago

Hi Mike, it sounds like you need to load the tcltk library explicitly:

library(tcltk)

This library should already have been loaded transparently. If you get another error at that stage, run the standard GUI with cytofkit::cytofkit_GUI() so that the required packages get loaded, quit the GUI, and then rerun the modified version with cytofkit_GUI(). Let me know.
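A small check along these lines can confirm that tclVar() is reachable before launching the sourced GUI. It is only a sketch; the helper name tcltk_ready is made up for illustration.

```r
# Returns TRUE when tcltk is attached and tclVar() can be called without a prefix.
tcltk_ready <- function() {
  # capabilities("tcltk") reports whether this R build includes Tcl/Tk support.
  if (!capabilities("tcltk")) {
    return(FALSE)
  }
  # require() attaches the package and returns FALSE instead of stopping on failure.
  attached <- suppressWarnings(require(tcltk, quietly = TRUE))
  attached && exists("tclVar", mode = "function")
}

ready <- tcltk_ready()
print(ready)
```

If this prints FALSE, the sourced cytofkit_GUI() would be expected to fail with the same "could not find function" error.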

mleipold commented 5 years ago

Hi Samuel,

That fixed it. I now get the GUI, and see the Dim.Red. section.

Thanks for your help! Mike

mleipold commented 5 years ago

I tried running the above overnight and got an error message that killed the operation (note: traceback info at the very bottom). Right after Rphenograph started, this message appeared: "Error in nn2(data, data, k, treetype = "kd", searchtype = "standard") : long vectors (argument 11) are not supported in .C"

Extract expression data...
48376032 x 50 data was extracted!
Dimension reduction...
Running PCA... DONE
Run clustering...
Running PhenoGraph...Run Rphenograph starts:
-Input data of 48376032 rows and 26 columns
-k is set to 100
Finding nearest neighbors...

Error in nn2(data, data, k, treetype = "kd", searchtype = "standard") : long vectors (argument 11) are not supported in .C
9. nn2(data, data, k, treetype = "kd", searchtype = "standard")
8. find_neighbors(data, k = k + 1)
7. system.time(neighborMatrix <- find_neighbors(data, k = k + 1)[, -1])
6. Rphenograph(xdata, k = Rphenograph_k)
5. membership(Rphenograph(xdata, k = Rphenograph_k))
4. FUN(X[[i]], ...)
3. lapply(clusterMethods, cytof_cluster, ydata = allDimReducedList[[dimReductionMethod]], xdata = exprs_data[, markers], Rphenograph_k = Rphenograph_k, FlowSOM_k = FlowSOM_k, flowSeed = seed)
2. cytofkit(fcsFiles = inputs[["fcsFiles"]], markers = inputs[["markers"]], projectName = inputs[["projectName"]], mergeMethod = inputs[["mergeMethod"]], fixedNum = inputs[["fixedNum"]], transformMethod = inputs[["transformMethod"]], dimReductionMethod = inputs[["dimReductionMethod"]], clusterMethods = inputs[["clusterMethods"]], ... at cytofkit_GUI_mod.R#603
1. cytofkit_GUI()

I tried rerunning it this morning, and got the same error.
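A quick size calculation from the numbers in the log suggests why rerunning cannot help. The .C interface rejects vectors longer than 2^31 - 1 elements, as the error message says. The input matrix itself stays under that limit, but a neighbor table with one entry per cell and neighbor does not. That argument 11 is exactly such a table is an assumption here, not something confirmed from the nn2 source.

```r
n_cells   <- 48376032   # rows reported in the log
n_markers <- 26         # columns reported in the log
k         <- 100 + 1    # Rphenograph_k is 100; find_neighbors() is called with k + 1

# Use doubles, not integers: integer arithmetic would overflow to NA at this size.
n_input_values  <- n_cells * n_markers
n_index_entries <- n_cells * k
limit           <- .Machine$integer.max   # 2^31 - 1

print(n_input_values <= limit)   # TRUE: the data matrix still fits
print(n_index_entries > limit)   # TRUE: one value per cell and neighbor does not
print(floor(limit / n_cells))    # largest k + 1 that stays under the limit
```

Under that assumption, only a smaller k or fewer cells per run would bring the neighbor table back under the limit.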

SamGG commented 5 years ago

Hi Mike, I think that's a problem with the Phenograph implementation. It's part of the logic of cytofkit and beyond my ability to fix. The nearest neighbor algorithm seems to have trouble handling 48 × 10^6 points. I am not sure that it is feasible in a reasonable time either. This part could be parallelized, but it was not.

I would suggest running cytofkit without Phenograph, at least for testing. flowSOM might be OK on that huge dataset. flowSOM (using a custom script) allows one to build the grid (i.e. cluster) using a reduced set of representative FCS files. Then it's possible to map many FCS files onto that grid. Because representative FCS files have been used to define the grid, rare populations that could be found in a single FCS file should also be found with flowSOM.

Best.
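The "build the grid on representatives, then map everything" idea can be sketched in base R. This is not the FlowSOM API: kmeans() stands in for SOM training, the data are simulated, and the helper names make_cells and map_to_codes are made up for the illustration.

```r
set.seed(42)

# Simulated stand-in for transformed expression data: two well-separated populations.
make_cells <- function(n) {
  rbind(matrix(rnorm(n * 3, mean = 0),  ncol = 3),
        matrix(rnorm(n * 3, mean = 10), ncol = 3))
}

# Step 1: learn cluster centres ("codes") on a small representative set only.
representative <- make_cells(200)
codes <- kmeans(representative, centers = 2, nstart = 5)$centers

# Step 2: assign every cell of a larger data set to its nearest code, chunk by
# chunk, so the full distance matrix never has to be held in memory at once.
map_to_codes <- function(x, codes, chunk_size = 100) {
  labels <- integer(nrow(x))
  for (start in seq(1, nrow(x), by = chunk_size)) {
    idx   <- start:min(start + chunk_size - 1, nrow(x))
    chunk <- x[idx, , drop = FALSE]
    # Squared Euclidean distance from each cell in the chunk to each code.
    d <- sapply(seq_len(nrow(codes)),
                function(j) rowSums(sweep(chunk, 2, codes[j, ])^2))
    d <- matrix(d, nrow = length(idx))
    labels[idx] <- max.col(-d, ties.method = "first")  # index of the smallest distance
  }
  labels
}

new_file <- make_cells(500)   # stands in for one of the many FCS files to map
labels   <- map_to_codes(new_file, codes)
print(table(labels))
```

The mapping step is a single pass over the cells, so its cost grows linearly with the number of cells and each file can be processed on its own.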