JinmiaoChenLab / cytofkit2


Error with ggplot #14

Open furbelows opened 5 years ago

furbelows commented 5 years ago

Hi, great program.

I am running cytofkit with fairly conventional parameters on a large set of samples. Some files have very few cells (several hundred), but most have a large number of cells.

I am getting an error when I cluster some of the files together, and I cannot pin down what the issue is. I think it is related to ggplot2:

Error in if (nchar(shape_string[1]) <= 1) { : missing value where TRUE/FALSE needed
In addition: Warning message:
Removed 233153 rows containing missing values (geom_point).

The complete output of the console is provided below.

Do you have any guidance on this issue?

Thanks!

Dy162Di<162Dy_CD45RA>
-Er166Di<166Er_CD33>
-Er167Di<167Er_CD28>
-Er168Di<168Er_CD24>
-Er170Di<170Er_CD161>
-Eu151Di<151Eu_CD38>
-Eu153Di<153Eu_CD11b>
-Gd155Di<155Gd_CCR6>
-Gd156Di<156Gd_CXCR4>
-Gd157Di<157Gd_CD86>
-Gd158Di<158Gd_CXCR5>
-Gd160Di<160Gd_CCR7>
-Ho165Di<165Ho_CD127>
-In113Di<113In_CD57>
-Lu175Di<175Lu_HLADR>
-Nd143Di<143Nd_CD4>
-Nd144Di<144Nd_CD8>
-Nd148Di<148Nd_CD11c>
-Pr141Di<141Pr_CD49d>
-Sm147Di<147Sm_CD85j>
-Sm149Di<149Sm_CD16>
-Sm152Di<152Sm_CD27>
-Sm154Di<154Sm_CD14>
-Tb159Di<159Tb_CXCR3>
-Tm169Di<169Tm_ICOS>
-Yb171Di<171Yb_TCRgd>
-Yb172Di<172Yb_PD-1>
-Yb173Di<173Yb_CD123>
-Yb174Di<174Yb_CD56>
-Yb176Di<176Yb_CD25>

Extract expression data...
434805 x 49 data was extracted!
Dimension reduction...
Running t-SNE...with seed 42 DONE
Run clustering...
Running PhenoGraph...Run Rphenograph starts:
-Input data of 434805 rows and 30 columns
-k is set to 33
Finding nearest neighbors...DONE ~ 1704.101 s
Compute jaccard coefficient between nearest-neighbor sets...DONE ~ 162.202 s
Build undirected graph from the weighted links...DONE ~ 34.81 s
Run louvain clustering on the graph ...DONE ~ 169.733 s
Run Rphenograph DONE, took a total of 2070.84599999998s.
Return a community class
-Modularity value: 0.8685872
-Number of clusters: 29
DONE!
Progression analysis...
Listing markers used for dimension reduction...
Stashing sample names...
Wrapping results...
Analysis DONE, saving the results...
R object is saved in cytofkit CD3.RData
THIS R OBJECT IS THE INPUT OF SHINY APP!
Error in if (nchar(shape_string[1]) <= 1) { : missing value where TRUE/FALSE needed
In addition: Warning message:
Removed 233153 rows containing missing values (geom_point).

lconde-ucl commented 5 years ago

Hi furbelows,

I am having the same issue. The error occurs when I analyse more than a certain number of samples together, but not when the same samples are analysed in smaller sets.

I found the problem: it is in the cytof_clusterPlot function (line 290 in https://github.com/JinmiaoChenLab/cytofkit2/blob/master/R/cytof_postProcess.R). The sampleLabel argument of that function defaults to TRUE, which makes the UMAP/t-SNE plots mark each individual sample with its own symbol (a letter) via shape = sample:

cp <- ggplot(data, aes_string(x = xlab, y = ylab, colour = cluster, shape = sample)) ...

If you deactivate that option with sampleLabel=FALSE, the plots are generated without any problem:

cp <- ggplot(data, aes_string(x = xlab, y = ylab, colour = cluster)) ...
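One plausible mechanism for the error, offered only as a sketch: if the per-sample shapes come from a fixed pool of letters (an assumption, not verified against the cytofkit2 source), any sample beyond the end of that pool gets a missing shape, and a length check on a missing string cannot be evaluated. The base R snippet below reproduces that behaviour in isolation; `n_samples` and `shape_pool` are hypothetical names.

```r
# Assumption (not verified against the cytofkit2 source): per-sample point
# shapes are taken from a fixed pool of 26 letters.
n_samples  <- 30                           # hypothetical number of samples
shape_pool <- LETTERS[seq_len(n_samples)]  # entries 27..30 are NA

# Count the samples that end up without a shape
n_missing <- sum(is.na(shape_pool))

# nchar() of a missing string is NA, so the comparison is NA as well,
# and using NA as an `if` condition raises an error
cond <- nchar(shape_pool[n_samples]) <= 1
result <- tryCatch(
  if (cond) "single character" else "named shape",
  error = function(e) conditionMessage(e)
)

print(n_missing)
print(result)
```

If this is indeed what happens, it would also be consistent with the warning about removed rows, since points belonging to samples without a shape could not be drawn.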

See the two attached example plots for the difference: one with the default sampleLabel=T, the other with sampleLabel=F.

The catch is that you (like me) are probably not calling cytof_clusterPlot directly. Instead, we call cytof_writeResults, which in turn calls cytof_clusterPlot internally and fails because of the issue above.

So my question for the developers is: could you expose the sampleLabel argument of cytof_clusterPlot through cytof_writeResults, so that users can set it to FALSE when necessary? Another option would be to modify cytof_clusterPlot so that sampleLabel is set to FALSE automatically when there are more than a certain number of samples. Or do you have any other suggestion on how to deal with this issue?
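The second option could look roughly like the sketch below. `use_sample_label` and its `max_shapes` cut-off are hypothetical and not part of cytofkit2; the commented call only reuses the argument names quoted earlier in this comment, so check it against the actual function signature before relying on it.

```r
# Hypothetical helper (not part of cytofkit2): decide whether per-sample
# shapes should be drawn, given the sample IDs and a maximum shape count.
use_sample_label <- function(sample_ids, max_shapes = 26L) {
  # max_shapes is an assumed cut-off; adjust it to what the plot supports
  length(unique(sample_ids)) <= max_shapes
}

few  <- use_sample_label(paste0("sample_", 1:10))  # 10 distinct samples
many <- use_sample_label(paste0("sample_", 1:35))  # 35 distinct samples
print(c(few = few, many = many))

# Not run: the flag would then be forwarded to the plotting call, e.g.
# cytof_clusterPlot(data, xlab = xlab, ylab = ylab, cluster = cluster,
#                   sample = sample,
#                   sampleLabel = use_sample_label(data[[sample]]))
```

Keeping the decision in a small separate function means the same check could be reused if cytof_writeResults were later extended to forward the argument.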

Thanks in advance, Best, Lucia

[Attached plots: with_default_sampleLabels_TRUE, with_sampleLabels_FALSE]

grhogg commented 4 years ago

I am struggling with the problem that no .Rdata file is generated when I run large batches of FCS files (say 35) through the cytofkit2 GUI. Has there been any resolution for this problem? It works fine for me when I run smaller batches (say 10).