Open mleipold opened 13 years ago
Hi Mike,
Yes. The short answer is that you start the SPADE pipeline after clustering and run just the upsampling, plot generation, etc. There is no easy way to do so without writing some R code, though. I think I have something close lying around that I can adapt for this purpose. Give me a few days...
Good question. This is certainly possible; in fact, it is essentially what SPADE does when it performs the "upsampling" step. There is currently no handy button in the interface to re-analyze new data against an old tree, but it can be done using a series of commands in the R console (or in a script). It's been a while since I've done this myself, so when you urgently need to do it, let me know and we'll go through it together. We'll post the resulting script here on GitHub for all to enjoy.
On Mon, Dec 5, 2011 at 1:38 PM, mleipold <reply@reply.github.com> wrote:
Here at the Stanford HIMC, we have a few multi-year customer studies. We would get the 2009 samples, process them, then repeat each year as the 2010, 2011, 2012, etc. samples come in.
Is there currently a way to build a tree with a given panel of markers starting with, say, the 2009 samples, and as each year's samples come in, perform SPADE analysis on them using the same tree scaffold/template of the 2009 samples?
Currently, the only way I know how to make everything appear on identically-framed trees is to include all the FCS files in the same tree-building exercise. This would mean that each year, we would have to rebuild our tree. While the trees built on the same markers should be similar, they would not necessarily be identical each time the analysis is rerun.
Kinda like how in FlowJo/etc, you build a template of your analysis, and can just keep dragging new FCS files in (though in the case of SPADE, I obviously wouldn't be asking to readjust gates).
Reply to this email directly or view it on GitHub: https://github.com/nolanlab/spade/issues/18
I prepared a script for upsampling additional files in the context of previous SPADE runs. It is available as a gist. It is essentially the upsampling, median computation and other components from SPADE.driver extracted as a stand-alone R script.
It expects a specifically prepared directory. The example I used is:
$ tree .
.
├── 20071001-u937.002.fcs
├── output
│ ├── clusters.fcs
│ ├── clusters.table
│ ├── layout.table
│ └── mst.gml
├── runSPADE.R
└── upsample.R
You will need to modify the upsample.R script with information from your original runSPADE.R, e.g., the clustering markers. You will also need to list the files you want processed in the panels listing.
You will need to prepare the output directory above, copying the files shown from the original SPADE run. These files include the cluster assignment information needed to upsample the new files, that is, to assign their cells to the existing clusters.
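The preparation step described above can be sketched in shell. This is a minimal sketch and not part of the gist: the function name `prepare_output_dir` and both directory arguments are hypothetical, and it assumes the original run's results live in that run's `output` subdirectory under the four file names shown in the listing.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPADE or the gist): copy the four files
# that describe the existing tree into a new analysis directory.
# Usage: prepare_output_dir ORIGINAL_RUN_DIR NEW_ANALYSIS_DIR
prepare_output_dir() {
    src="$1/output"   # assumed location of the original run's results
    dst="$2/output"   # directory that upsample.R is expected to read from
    mkdir -p "$dst" || return 1
    for f in clusters.fcs clusters.table layout.table mst.gml; do
        # Stop early if the original run is missing one of the tree files.
        if [ ! -f "$src/$f" ]; then
            echo "missing: $src/$f" >&2
            return 1
        fi
        cp "$src/$f" "$dst/" || return 1
    done
}
```

For example, `prepare_output_dir ~/spade/2009 ~/spade/2010` (both paths are placeholders) would leave a populated `output` directory in the 2010 folder, ready for the new FCS files and the two R scripts to be placed beside it.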
With all that in place, you can run the upsample.R script. It should upsample the new FCS files (20071001-u937.002.fcs in this case), compute medians, create PDFs, etc.
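Before running the script, it may help to confirm that the directory matches the layout shown earlier. The following is only a sketch: `check_upsample_layout` is a hypothetical helper, not something shipped with SPADE or the gist, and it assumes that upsample.R resolves `output/` relative to the directory it is run from.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of SPADE or the gist).
# Usage: check_upsample_layout ANALYSIS_DIR
check_upsample_layout() {
    dir="$1"
    status=0
    # Files the listing above shows alongside and inside output/.
    for f in upsample.R output/clusters.fcs output/clusters.table \
             output/layout.table output/mst.gml; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            status=1
        fi
    done
    # At least one new FCS file should sit next to upsample.R.
    if ! ls "$dir"/*.fcs >/dev/null 2>&1; then
        echo "no .fcs files found in $dir" >&2
        status=1
    fi
    return "$status"
}

# Example, commented out so that sourcing this file has no side effects:
# cd /path/to/analysis && check_upsample_layout . && Rscript upsample.R
```

Running the check and the script from the same directory keeps relative paths such as `output/global_boundaries.table` pointing at the prepared folder.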
The result will look something like:
$ tree -L 2 .
.
├── 20071001-u937.002.fcs
├── output
│ ├── 20071001-u937.002.fcs.density.fcs.cluster.fcs
│ ├── 20071001-u937.002.fcs.density.fcs.cluster.fcs.anno.Rsave
│ ├── 20071001-u937.002.fcs.density.fcs.cluster.fcs.medians.gml
│ ├── clusters.fcs
│ ├── clusters.table
│ ├── global_boundaries.table
│ ├── layout.table
│ ├── mst.gml
│ └── pdf
├── runSPADE.R
└── upsample.R
Is this something that is going to be implemented as an "Add new FCS file to existing tree" button or dropdown option in the future?
That's not planned for the Cytoscape interface, but I think it should be added to the feature request list for the Cytobank implementation.
Hi,
As I'm facing the same issue (I want to analyze new data against an old tree), I tried to use the upsampling script posted above. I set up the directory as indicated and modified upsample.R with the names of the new data files as well as the clustering markers. Then I ran the script in R. I got the following error messages and, being new to R, I don't know how to fix them:
Computing medians for file:
Error in apply(mat, 2, tform) : dim(X) must have a positive length
In addition: Warning message:
In SPADE.markerMedians(f, igraph:::vcount(graph), cols = p$median_cols, :
  arcsinh_cofactor is deprecated, use transform=flowCore::arcsinhTransform(...) instead

# Compute the global limits (cleaning up attribute names to match those in GML files)
attr_ranges <- t(sapply(attr_values, function(x) { quantile(x, probs=c(0.00, pctile_color, 1.00), na.rm=TRUE) }))
rownames(attr_ranges) <- sapply(rownames(attr_ranges), function(x) { gsub("[^A-Za-z0-9]","",x) })
write.table(attr_ranges, paste(out_dir,"global_boundaries.table",sep=""), col.names=FALSE)
Error in file(file, ifelse(append, "a", "w")) : cannot open the connection
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
  cannot open file 'output/global_boundaries.table': No such file or directory

SPADE.plot.trees(graph,out_dir,file_pattern="_fcs_Rsave",layout=as.matrix(layout_table),out_dir=paste(out_dir,"pdf",sep=.Platform$file),size_scale_factor=NODE_SIZE_SCALE_FACTOR)
Error in SPADE.plot.trees(graph, out_dir, file_pattern = "_fcs_Rsave", :
  Not a graph object
The only thing I can say is that the 'global_boundaries.table' file is in the output directory.