AlexsLemonade / alsf-scpca

Management and analysis tools for ALSF Single-cell Pediatric Cancer Atlas data.
BSD 3-Clause "New" or "Revised" License

Benchmarking of Spatial Transcriptomic libraries #149

Closed allyhawkins closed 2 years ago

allyhawkins commented 2 years ago

Closes #143. Here I have added a notebook that explores the output of running either Alevin-fry + Spaceranger or Spaceranger only on two spatial transcriptomic libraries. For the Alevin-fry quantification, I followed the guidance in their tutorial.

There are a few things that I looked at in this notebook:

  1. The use of a SpatialExperiment object vs. a Seurat object for storing the spatial data. In general, the SpatialExperiment object is fairly versatile and has similar functionality to the SingleCellExperiment, with additional storage for spatial information. I was also able to make some plots using ggspavis. To stay consistent with how we have been working with the single-cell and single-nuclei datasets, I think it would be good to continue with SpatialExperiment objects. For loading the data into R, I followed the recommendations in the OSTA Book and the SpatialExperiment vignette.
     * One thing to note is that with the SpatialExperiment plots I couldn't figure out a way to plot the spots with the tissue image underneath. I don't know whether that's a dealbreaker or something we will want down the line, but doing that might take some more finagling.

  2. I also compared using Alevin-fry for quantification against using Spaceranger only, looking at mito content, UMI per spot, genes detected per spot, and then the actual distribution of these values across the spots in the tissue. Although the distributions are similar, the spatial plots don't quite line up as expected, which is slightly concerning (i.e. the distribution of the color of the spots is not the same for each sample when comparing the two tools). I also looked at the correlation of mean gene expression and the overlap of genes detected, and generally saw high correlation (other than one line of genes shifted off the diagonal) and high overlap.
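As a rough illustration of the last comparison in item 2 (overlap of detected genes and correlation of mean gene expression), here is a minimal sketch in plain Python. It is not the code from the notebook, which works with R objects. The function names, the gene names, and the toy counts are all hypothetical, and "detected" is assumed here to mean a nonzero total count across spots.

```python
from math import sqrt


def mean_expression(counts):
    """Map each detected gene to its mean count across spots.

    `counts` is a dict of gene -> list of per-spot counts. A gene is
    treated as detected if its total count is nonzero (an assumption).
    """
    return {
        gene: sum(values) / len(values)
        for gene, values in counts.items()
        if sum(values) > 0
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def compare_tools(counts_a, counts_b):
    """Return (Jaccard overlap of detected genes, correlation of means)."""
    means_a = mean_expression(counts_a)
    means_b = mean_expression(counts_b)
    shared = sorted(set(means_a) & set(means_b))
    union = set(means_a) | set(means_b)
    jaccard = len(shared) / len(union)
    r = pearson([means_a[g] for g in shared], [means_b[g] for g in shared])
    return jaccard, r


# Toy counts for three spots per tool (hypothetical values).
alevin_fry = {"GeneA": [1, 2, 3], "GeneB": [0, 4, 8],
              "GeneC": [5, 5, 8], "GeneD": [0, 0, 0]}
spaceranger = {"GeneA": [1, 2, 3], "GeneB": [0, 5, 7],
               "GeneC": [6, 6, 6], "GeneE": [1, 0, 0]}

jaccard, r = compare_tools(alevin_fry, spaceranger)
print(f"gene overlap (Jaccard): {jaccard:.2f}, correlation of means: {r:.3f}")
```

In practice the comparison is often done on log-transformed means, and a group of genes sitting off the diagonal would not show up in a single summary number, so a scatter plot is still worth making.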

My overall conclusion after looking at these is that I'm not sure the additional use of Alevin-fry with Spaceranger is worth it when we have to use Spaceranger anyway for these samples, but let me know if reviewers come to a different conclusion or would like to see other comparisons.

I'm attaching the HTML file of the rendered notebook here for reference.

allyhawkins commented 2 years ago

Thanks for the help with this @jashapiro! One of your suggestions explained why the pattern wasn't the same between the Spaceranger and the Alevin-fry + Spaceranger versions, so that looks much better now. I'm still not entirely sure why they are flipped, but something about integrating the Alevin-fry results appears to flip the plot. I switched the coordinate names and also tried plotting with the x and y axes swapped, but that obviously switches both versions, so something is inherently different about how the data is stored in the two experiments.
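One way to narrow down an orientation mismatch like this is to compare the stored spot coordinates directly, barcode by barcode, against a few candidate transforms. The sketch below is a hypothetical diagnostic in plain Python, not something taken from the notebook. The helper names, barcodes, and coordinates are made up, and it assumes both objects share barcodes and that coordinates start at 0.

```python
def candidate_transforms(max_x, max_y):
    """Simple transforms that could relate two coordinate layouts."""
    return {
        "identity": lambda x, y: (x, y),
        "swap_xy": lambda x, y: (y, x),
        "flip_x": lambda x, y: (max_x - x, y),
        "flip_y": lambda x, y: (x, max_y - y),
    }


def matching_transforms(coords_a, coords_b):
    """Return the names of transforms that map coords_a onto coords_b.

    Both arguments are dicts of barcode -> (x, y). Only barcodes present
    in both are compared. Flips mirror around the maximum coordinate in
    coords_a, which assumes coordinates start at 0.
    """
    shared = coords_a.keys() & coords_b.keys()
    if not shared:
        raise ValueError("no shared barcodes to compare")
    max_x = max(x for x, _ in coords_a.values())
    max_y = max(y for _, y in coords_a.values())
    return [
        name
        for name, transform in candidate_transforms(max_x, max_y).items()
        if all(transform(*coords_a[b]) == coords_b[b] for b in shared)
    ]


# Hypothetical spot coordinates keyed by barcode.
spaceranger_coords = {"AAAC": (0, 0), "AAAG": (2, 1), "AACT": (1, 3)}
# Same spots with the y axis mirrored.
alevin_fry_coords = {"AAAC": (0, 3), "AAAG": (2, 2), "AACT": (1, 0)}

print(matching_transforms(spaceranger_coords, alevin_fry_coords))
```

If none of the candidates match, the difference may come from something other than a simple flip or swap, such as a different barcode order when the two outputs are merged.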

I also unloaded ggspavis and added the package name to each function call that uses it or Seurat. I believe I caught everything, but let me know if I missed anything. One annoying thing about Seurat here is that it has a conflict with SpatialExperiment, so to make the plot I had to remove the SpatialExperiment package, make the plot, and then re-install it.

allyhawkins commented 2 years ago

@jashapiro based on our offline discussion I went ahead and made some changes to these notebooks, splitting the exploration of SpatialExperiment vs. Seurat and the benchmarking into separate notebooks. In this PR, I kept the notebook that compares the two objects and also added a function to benchmarking-functions/R to read Alevin-fry + Spaceranger output into a SpatialExperiment (spe), since we will be using that down the line. I am moving the additional benchmarking into its own notebook, which I will file as a separate PR stacked on this one.

I also added the use of plotVisium to look at the plots with the image overlay for comparison, but overall, I think it still makes sense to continue using the SpatialExperiment object rather than Seurat here. Please let me know if there are any other comparisons or other things that I should explore for these objects.

allyhawkins commented 2 years ago

@jashapiro I added some explanation about the level of incompatibility between SpatialExperiment and Seurat to the end of the notebook. You can run the notebook to completion except for the last chunk. Warnings will be produced, but they won't stop things from running. The last chunk, where the plot is made using Seurat, will not work unless you remove SpatialExperiment and then reinstall Seurat. I found the solution on an issue that was filed in the Seurat repo where the SpatialExperiment developers also commented and added another comment about it still existing as an issue. I also tried to add the removal and reinstallation of the package to the notebook itself, and it did not seem to like that.

jashapiro commented 2 years ago

> I found the solution on an issue that was filed in the Seurat repo where the SpatialExperiment developers also commented and added another comment about it still existing as an issue.

Can you add a link to that issue to the notebook? That will be helpful for somebody who comes to look at the notebook later, in part so we might be able to see if and when it is fixed.

allyhawkins commented 2 years ago

> Can you add a link to that issue to the notebook? That will be helpful for somebody who comes to look at the notebook later, in part so we might be able to see if and when it is fixed.

Yes! Went ahead and added that in.