hemberg-lab / SC3

A tool for the unsupervised clustering of cells from single cell RNA-Seq experiments
http://bioconductor.org/packages/SC3
GNU General Public License v3.0

SC3 repeatedly loads packages while running and has data display issues #76

Closed jamesrhowe closed 5 years ago

jamesrhowe commented 5 years ago

I am trying to use your algorithm for clustering, but I am running into a few odd issues. First, the package keeps reloading SC3 and its dependencies while running (pkgmaker in particular). This happens between every step after k estimation, and at least three times before it proceeds to the next step. Each time, it also warns that the isFALSE function is masked from the base implementation. This occurs with both the normal and the hybrid SVM implementation, whether or not other libraries are loaded into the workspace. I do not know if it matters, but my dataset has 5100 cells.

I have included a screenshot below:

[Screenshot: 2018-08-14, 3:25:18 pm]

I do not know whether this issue results from the one above, but whenever I then try to use the sc3_plot_consensus function, it fails and returns a rendering error. SC3 generally creates the correct object and returns output$sc3_n_clusters, but after a non-SVM run the SCESet object does not display properly when I try to View() it.

I have included screenshots of this issue as well:

normal:

[Screenshot: 2018-08-14, 3:31:23 pm]

hybrid:

[Screenshot: 2018-08-14, 4:08:33 pm]

Lastly, when I run it in SVM mode, it returns many NA values, which does not happen in non-SVM mode.

normal:

[Screenshot: 2018-08-14, 4:11:05 pm]

hybrid:

[Screenshot: 2018-08-14, 4:13:00 pm]

I am sorry to present so many questions about both modes... I have just spent the last few days troubleshooting and have made very little progress, despite greatly wanting to use this method!

Thanks a bunch!

mhemberg commented 5 years ago

Hi James,

Thanks for your questions. Vladimir, who is the real expert on this, is currently away on vacation, so it will take some time before he can give you a more informed answer than mine. In the meantime, here are my thoughts:

1) On the plotting issue (your second screenshot): the error appears to indicate that you are running out of memory. Have you tried increasing the memory available to R, or running on a machine with more memory, so that this message does not appear?

2) For the SVM NA issue, is this with the dataset containing 5100 cells, or is it a larger dataset?

Best Regards,

Martin

jamesrhowe commented 5 years ago
  1. All right, I will try it on a cluster and let you know how it goes.

  2. The NAs were produced from the dataset containing 5100 cells, which is the one I used for all of my analyses. I ran the same dataset with and without SVM to see how each mode performed on my data, and to find out whether they yield similar results or whether the additional speed comes at the cost of accuracy.

wikiselev commented 5 years ago

Hi James,

  1. The log messages about loading SC3 are fine.
  2. You won't be able to plot the consensus matrix for 5100 cells; it requires too much memory. I would recommend this plot only for datasets with fewer than 1000 cells.
  3. Read about the SVM and NA values in the SC3 vignette: https://bioconductor.org/packages/release/bioc/vignettes/SC3/inst/doc/SC3.html#hybrid-svm-approach
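
For reference, here is a minimal sketch of the hybrid SVM workflow described in that vignette section. It is an illustration under stated assumptions, not a fix for the dataset above: it uses the small `yan`/`ann` example data bundled with SC3 as a stand-in for the 5100-cell dataset, and the values `ks = 3` and `svm_num_cells = 50` are arbitrary choices for demonstration. In hybrid mode only a subset of cells is clustered directly, so the remaining cells carry NA labels until `sc3_run_svm()` is called.

```r
library(SingleCellExperiment)
library(SC3)

# Example data bundled with SC3; a stand-in for the real dataset (assumption).
sce <- SingleCellExperiment(
    assays = list(
        counts = as.matrix(yan),
        logcounts = log2(as.matrix(yan) + 1)
    ),
    colData = ann
)
rowData(sce)$feature_symbol <- rownames(sce)
sce <- sce[!duplicated(rowData(sce)$feature_symbol), ]

# Hybrid mode: only 50 randomly selected training cells are clustered.
# ks = 3 and svm_num_cells = 50 are illustrative values, not recommendations.
sce <- sc3(sce, ks = 3, svm_num_cells = 50)

# Cells outside the training set have NA cluster labels at this point.
n_na_before <- sum(is.na(colData(sce)$sc3_3_clusters))

# Train the SVM on the clustered cells and predict labels for the rest.
sce <- sc3_run_svm(sce, ks = 3)
n_na_after <- sum(is.na(colData(sce)$sc3_3_clusters))

print(c(before = n_na_before, after = n_na_after))
```

As far as I understand, `sc3()` also switches to hybrid mode on its own once the number of cells exceeds its `svm_max` argument (5000 by default), which would explain NA labels appearing for a 5100-cell dataset even when `svm_num_cells` is not set.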

Hope this helps.

Cheers,

Vlad