Open seanam opened 8 years ago
Hi, as a simple user I give quick feedback from time to time, and others may answer in more detail. I think it will be easier for us to help if you tell us what your data are. In flow cytometry (FCM), a sample is usually stored in an FCS file and consists of many events recorded in many dimensions, the fluorescent markers. That is, a sample is a matrix whose columns are markers (dimensions) and whose rows are events. Depending on the population they belong to, events (cells) can be numerous or rare. That is why down-sampling is important, especially because SPADE's down-sampling aims to yield evenly sampled populations. To oversimplify, the rest of the process is based on clustering.
HTH.
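To make the matrix picture and the motivation for density-dependent down-sampling concrete, here is a minimal R sketch. It is a simplified illustration of the idea, not SPADE's actual algorithm: the simulated populations, the neighbourhood radius of 1, and the 5th-percentile target density are all arbitrary assumptions.

```r
set.seed(1)

# A sample as a matrix: rows are events (cells), columns are markers.
# Both populations are simulated (hypothetical data).
abundant <- matrix(rnorm(950 * 3, mean = 0), ncol = 3)
rare     <- matrix(rnorm(50 * 3,  mean = 6), ncol = 3)
events   <- rbind(abundant, rare)
colnames(events) <- paste0("marker", 1:3)
population <- rep(c("abundant", "rare"), times = c(950, 50))

# Local density: number of events within a fixed radius (counts the event itself).
d <- as.matrix(dist(events))
density <- rowSums(d < 1)

# Keep each event with probability inversely proportional to its local density,
# so crowded regions are thinned more than sparse ones.
target    <- unname(quantile(density, 0.05))
keep_prob <- pmin(1, target / density)
keep      <- runif(nrow(events)) < keep_prob

# Compare population sizes before and after down-sampling.
print(table(population))
print(table(population[keep]))
```

Comparing the two tables shows how the balance between the abundant and the rare population changes once dense regions are thinned.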
Hi Sean,
To skip downsampling you could try setting `downsampling_target_percent=1.0`, but I actually haven't tried this. For technical reasons you might need to try 0.99 or something slightly less than 1.0. (Internally some of our lab members have clustered data without downsampling, and the biggest issue is usually availability of system resources/time, but otherwise it can look nice.)
`CLUSTERING_SAMPLES` is the number of events that will be randomly selected after the density-dependent downsampling and is there to avoid swamping your computer. It can be greater than the number of rows you start with.
`TARGET_CLUSTERS` is the target number of nodes.
`SPADE.driver` is the main entry point and would be the code to work from if you want to try this. Otherwise I'd recommend exporting your dataframes to FCS format first.
Might want to try csvtofcs, which should be able to convert a dataframe into FCS.
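As an alternative to a separate conversion tool, the data-frame-to-FCS step can also be sketched with the Bioconductor package flowCore. This is only a sketch under assumptions: the data frame `df`, its marker names, and the output path are hypothetical stand-ins, and the export is skipped if flowCore is not installed.

```r
# Hypothetical stand-in for the real measurements: 100 events x 10 markers.
set.seed(7)
df <- as.data.frame(matrix(rexp(100 * 10), nrow = 100))
names(df) <- paste0("marker", 1:10)

# Hypothetical output location.
fcs_path <- file.path(tempdir(), "microscopy.fcs")

if (requireNamespace("flowCore", quietly = TRUE)) {
  # A flowFrame wraps a numeric matrix whose column names are the channel names.
  ff <- flowCore::flowFrame(exprs = as.matrix(df))
  flowCore::write.FCS(ff, fcs_path)
} else {
  message("flowCore is not installed; skipping the FCS export.")
}
```

The resulting file can then be passed to SPADE like any other FCS file.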
Thanks for your replies! Your input has been very helpful. @SamGG: my dataset consists of fluorescence measurements acquired via microscopy. There are approximately 10 markers (dimensions) and 100 events (rows), so it's a pretty small dataset for now.
@zbjornson: I tried `downsampling_target_percent=1.0` but got the following error: Error in if (nrow(tbl) > 60000) { : argument is of length zero
Not that important since I can use 0.99, but it seems that setting it to 1.0 doesn't work.
I am getting this error now:
Producing tables... Error in rownames(pivot) : object 'pivot' not found
Any idea what is happening?
@dm319: Thanks for the tip! This works better than the other conversion method I was using.
IMHO, you would be better off doing a simple hierarchical clustering, multidimensional scaling, or t-SNE than trying to fit your data into SPADE, especially if you don't apply down-sampling. Best.
Re: `object 'pivot' not found`: that's from https://github.com/nolanlab/spade/blob/master/R/driver.R#L256 and it looks like that would happen if there were no .anno.Rsave files produced (not sure why that would happen). Try commenting out lines 256 through 265; it will remove just one of the three transpositions of the statistics tables.
Re: @SamGG's comment: without downsampling, SPADE's clustering is plain hierarchical clustering (followed by the MST and layout calculations). There's no harm in using SPADE for that, but it might be simpler to use the underlying clustering module (Rclusterpp) directly. Rclusterpp, in turn, is a faster replacement for the built-in `hclust` function.
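For a dataset of the size mentioned in this thread (about 100 events by 10 markers), plain hierarchical clustering with base R might look like the sketch below. The simulated data, the average-linkage method, and the choice of 5 clusters are assumptions for illustration only; they are not what SPADE or Rclusterpp use internally.

```r
set.seed(42)

# Hypothetical data: 100 events (rows) x 10 markers (columns).
events <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10,
                 dimnames = list(NULL, paste0("marker", 1:10)))

# Put all markers on a comparable scale before computing distances.
scaled <- scale(events)
d <- dist(scaled)                      # pairwise Euclidean distances

# Agglomerative hierarchical clustering with base R's hclust.
tree <- hclust(d, method = "average")

# Cut the dendrogram into a chosen number of clusters (arbitrary here).
clusters <- cutree(tree, k = 5)
print(table(clusters))

# Classical multidimensional scaling to 2 dimensions for visualisation.
coords <- cmdscale(d, k = 2)
```

Calling `plot(coords, col = clusters, pch = 19)` afterwards gives a quick 2-D view of the events coloured by cluster.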
Hi,
I'm interested in using SPADE for non-cytometry data and have followed the recommendations in the FAQ. I have a few questions about the process:
What does `CLUSTERING_SAMPLES` represent and how does it relate to `TARGET_CLUSTERS`?
Thanks, Sean