Closed gdreiman1 closed 7 years ago
That's the correct file, and I think that's a good change to try!
Hi,
Thanks for pointing out the article. You did find the right function and code. The name of the function is stated on the 2nd line of the cluster.R file.
https://github.com/nolanlab/spade/blob/master/R/cluster.R#L2
Your proposal of changing > to >= is correct. Because of drop=FALSE, a single observation will be kept as a matrix of one row, and colMeans will work.
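A minimal sketch of that behaviour, using a made-up 3 x 2 matrix (the names data, obs, m1 and m2 are only for illustration, not taken from cluster.R):

```r
# Toy data: 3 observations (rows) and 2 markers (columns).
data <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3,
               dimnames = list(NULL, c("m1", "m2")))
obs <- 2  # a "cluster" that contains a single observation

# Without drop = FALSE the single row collapses to a plain vector,
# and colMeans() refuses it because it needs at least two dimensions.
dropped <- data[obs, ]
res <- tryCatch(colMeans(dropped), error = function(e) e)

# With drop = FALSE the single row stays a 1 x 2 matrix,
# so colMeans() simply returns that row's values.
kept <- data[obs, , drop = FALSE]
center <- colMeans(kept)
print(dim(kept))
print(center)
```

So with drop=FALSE in place, relaxing the size check should not break the mean computation for one-cell clusters.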
If you are used to working with packages in RStudio (or wish to), I think you should try the official way. The quick and dirty way consists of directly modifying the code of the cluster.R file in your installed SPADE library. You can find the right directory using path.package("spade") (after you have installed the SPADE package).
I am not a fan of using SPADE for small single-cell RNA-seq data analysis. See the comments https://github.com/nolanlab/spade/issues/122#issuecomment-198071225 and https://github.com/nolanlab/spade/issues/128#issuecomment-237693735 for alternatives.
HTH
Thanks for the quick replies! @SamGG Can you expand on the "quick and dirty way"? The issue I'm having is that I can't find any cluster.R in my SPADE library. I'm not sure why this is. Is there a way to download the cluster.R file from github to use in my library?
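For context, an installed R package usually does not contain its original .R source files; the code is stored in a lazy-load database inside the package's R/ directory. The sketch below inspects the base package stats as a stand-in so that it runs without SPADE installed. find.package() is used here instead of path.package() because it does not require the package to be loaded.

```r
# Locate the installed package directory ("stats" is only a stand-in;
# use "spade" once that package is installed).
pkg_dir <- find.package("stats")

# List what the installed R/ subdirectory actually holds.
r_files <- list.files(file.path(pkg_dir, "R"))
print(r_files)

# Check whether any plain .R source files are present.
has_r_sources <- any(grepl("\\.[Rr]$", r_files))
print(has_r_sources)
```

If no .R files show up, editing the installed copy is not an option and the change has to be made in the package source instead.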
Oops, sorry, I was completely wrong: the quick and dirty way does not work. So, the proper way consists of building the new package:
Happy coding, HTH
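One possible way to install a modified copy from source is sketched below. It is not necessarily the exact procedure meant above. It assumes the repository has been cloned or unpacked locally; the path ~/src/spade and the helper name install_modified_package are hypothetical. Because SPADE contains compiled code, a working compiler toolchain is assumed as well.

```r
# Hypothetical location of a local copy of the SPADE source tree,
# e.g. a clone of https://github.com/nolanlab/spade
src_dir <- "~/src/spade"

# Hypothetical helper: install a package from a local source directory.
install_modified_package <- function(src_dir) {
  src_dir <- path.expand(src_dir)
  # A source package has a DESCRIPTION file and an R/ directory.
  stopifnot(file.exists(file.path(src_dir, "DESCRIPTION")),
            dir.exists(file.path(src_dir, "R")))
  # repos = NULL tells install.packages() to treat the path as a local
  # source package rather than a package name to download.
  install.packages(src_dir, repos = NULL, type = "source")
}

# After editing R/cluster.R inside src_dir, run:
# install_modified_package(src_dir)
```

Restart the R session after installing so that the rebuilt package is the one that gets loaded.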
Got it working! Thanks for such a thorough explanation, I really appreciate it.
Although it is outside the scope of SPADE, have a look at CIDR (http://www.biorxiv.org/content/early/2016/08/10/068775, https://github.com/VCCRI/CIDR), which challenges state-of-the-art methods, including tSNE.
In the 2016 Nature Protocols paper, the troubleshooting section explains that SPADE drops any cluster containing a single cell from the graph:
I've been trying to find and modify this function so that I can run SPADE on a small (~100 cell) dataset. However, I can't seem to find a function named SPADE.cluster in the package I downloaded. I did find a file named cluster.R (on the github page) that seems to have a section checking cluster size in lines 15-28:
Is this the code that I need to modify? And if so how would I modify it and then incorporate the modification into my downloaded SPADE package?
My initial instinct is to change if (length(obs) > 1) to if (length(obs) >= 1), but I'm not sure if colMeans() will work if obs has a single entry.
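To illustrate what the proposed change does, here is a small sketch of a size check of this kind. It is not the code from cluster.R; the helper cluster_centers, its min_size argument, and the toy data are all hypothetical.

```r
# Hypothetical helper: per-cluster column means, skipping clusters with
# fewer than `min_size` observations.
cluster_centers <- function(data, assign, min_size = 1) {
  ids <- sort(unique(assign))
  centers <- lapply(ids, function(id) {
    obs <- which(assign == id)
    if (length(obs) >= min_size) {
      # drop = FALSE keeps a one-row selection as a matrix for colMeans().
      colMeans(data[obs, , drop = FALSE])
    } else {
      NULL  # cluster is too small and gets skipped
    }
  })
  names(centers) <- ids
  # Remove skipped clusters and stack the remaining centers row-wise.
  do.call(rbind, Filter(Negate(is.null), centers))
}

# Toy data: three cells, two markers; cluster 2 holds a single cell.
data <- matrix(c(1, 3, 10, 2, 4, 20), ncol = 2,
               dimnames = list(NULL, c("m1", "m2")))
assign <- c(1, 1, 2)

strict  <- cluster_centers(data, assign, min_size = 2)  # like `> 1`
relaxed <- cluster_centers(data, assign, min_size = 1)  # like `>= 1`
print(strict)
print(relaxed)
```

With the stricter check the one-cell cluster disappears from the result, while the relaxed check keeps it with the cell's own values as its center.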