lmweber / PrinciplesSTA

"Principles of Spatial Transcriptomics Analysis with Bioconductor" book
https://lmweber.org/PrinciplesSTA/

Editorial Suggestions of Chapter 9 Quality Control #34

Open boyiguo1 opened 2 years ago

boyiguo1 commented 2 years ago

Link to Chapter: https://lmweber.org/OSTA-book/quality-control.html

Suggestions:

  1. Emphasize that the cell_count column in the SpatialExperiment object is produced by VistoSeg, not by spaceranger count, to avoid confusion between counting cells and counting UMIs. For example, we can expand the phrase "(which is available for this dataset)" on line 101 https://github.com/lmweber/OSTA-book/blob/6557b14a22043be9b35b975b3f17e37e6ad485ce/chapters/09-quality_control.Rmd#L101

with

To recall, cell_count is created using VistoSeg (see [OSTA Chapter 5.6](https://lmweber.org/OSTA-book/image-segmentation-visium.html#identify-number-of-cells-per-spot)). spaceranger also produces count information with its count pipeline, but these are expression counts, which are used to create the assays of a SpatialExperiment object and from which the library size of each spot (sum in the following figure) is derived.

  2. Provide more interpretation of the QC scatter plots and explain the underlying assumptions:

For example, we can add the following paragraphs at https://github.com/lmweber/OSTA-book/blob/6557b14a22043be9b35b975b3f17e37e6ad485ce/chapters/09-quality_control.Rmd#L101 after the sentence "We also plot the library sizes against the number of cells per spot (which is available for this dataset). "

The blue curve describes the non-linear relationship between the library size (sum) of each spot and the number of cells (cell_count) in that spot. Ideally, the blue curve should be monotonically increasing: the more cells there are in a spot, the larger its average library size. This rests on the assumption that more cells in a spot yield more UMIs. In practice, we expect the blue curve to plateau or decrease slightly at certain values of cell_count, i.e. the number of cells per spot. However, the decrease should not be too pronounced.

Moreover, we could also threshold on the number of cells per spot. We expect the number of cells per spot to stay within a biologically reasonable range. In other words, when the number of cells per spot exceeds a certain threshold (particularly in combination with a small library size), it is not biologically reasonable to believe that the spot can accommodate that many cells. Hence, we consider such a spot to be of poor quality.

The histograms on the top and the left of the graph show the frequency of the dots in the scatter plot along each axis. For example, the histogram on top of the figure indicates that roughly 600 spots contain roughly 4 cells. Incidentally, the blank space in the top histogram is an artifact of an unsuitable break size.
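The reasoning in the suggested paragraphs above (a roughly increasing trend of library size with cell count, plus a cutoff on cell count) can be sketched as follows. This is a minimal illustration in plain Python rather than the book's R code. The spot values, the helper names, and the cutoff of 10 cells are all hypothetical and chosen only to make the idea concrete.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-spot values. In the book these would correspond to the
# cell_count (from VistoSeg) and sum (library size) columns of the object.
spots = [
    {"cell_count": 1, "sum": 900},
    {"cell_count": 1, "sum": 1100},
    {"cell_count": 3, "sum": 2400},
    {"cell_count": 3, "sum": 2600},
    {"cell_count": 5, "sum": 3900},
    {"cell_count": 5, "sum": 4100},
    {"cell_count": 12, "sum": 3000},
    {"cell_count": 15, "sum": 400},
]


def mean_library_size_by_cell_count(spots):
    """Average library size for each observed cell count (a crude trend)."""
    groups = defaultdict(list)
    for spot in spots:
        groups[spot["cell_count"]].append(spot["sum"])
    return {count: mean(sizes) for count, sizes in sorted(groups.items())}


def flag_high_cell_count(spots, max_cells):
    """Mark spots whose cell count exceeds the (assumed) cutoff."""
    return [spot["cell_count"] > max_cells for spot in spots]


trend = mean_library_size_by_cell_count(spots)
flags = flag_high_cell_count(spots, max_cells=10)  # cutoff is an assumption

print(trend)
print(sum(flags), "spots flagged")
```

In this toy data the mean library size rises up to 5 cells per spot and then drops at 12 and 15 cells, which is the kind of pattern that would prompt a closer look at the high-cell-count spots before discarding them.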

  3. Some histograms describing the marginal distribution of cell_count show blank spaces that are artifacts; the break size of the histogram needs to be adjusted. Specifically, see the figure generated on lines 103-108 https://github.com/lmweber/OSTA-book/blob/6557b14a22043be9b35b975b3f17e37e6ad485ce/chapters/09-quality_control.Rmd#L103

  4. The sentence "This is to check that we are not inadvertently removing a biologically meaningful group of spots." on line 101 appears too early in the text. https://github.com/lmweber/OSTA-book/blob/6557b14a22043be9b35b975b3f17e37e6ad485ce/chapters/09-quality_control.Rmd#L101 IMO, this gives the impression that one can use plotQC to tell whether the filtered spots belong to a biologically meaningful group. The sentence seems to refer to the spatial plot, where one can display the filtered spots and see whether their pattern matches a known biological structure, e.g. the laminar organization of a brain. Hence, I suggest moving this sentence later in the text, perhaps right before or after the spatial plot on line 142 https://github.com/lmweber/OSTA-book/blob/6557b14a22043be9b35b975b3f17e37e6ad485ce/chapters/09-quality_control.Rmd#L142
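To illustrate why the cell_count histogram mentioned above can show blank spaces: cell counts are integers, so any bin narrower than 1 that contains no integer will always be empty. The sketch below is plain Python with made-up counts and a hypothetical `histogram` helper. It is not the plotting code used in the book, and it makes no assumption about how the break size is passed to the plotting function.

```python
def histogram(values, lo, hi, n_bins):
    """Count values in n_bins equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Values equal to hi fall into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical integer cell counts: every value from 0 to 10 occurs 3 times.
cell_counts = [c for c in range(0, 11) for _ in range(3)]

# 20 bins over [0, 10] gives a bin width of 0.5, so many bins hold no integer.
narrow = histogram(cell_counts, 0, 10, 20)

# 11 bins of width 1 over [-0.5, 10.5], each centred on one integer.
aligned = histogram(cell_counts, -0.5, 10.5, 11)

print("narrow bins: ", narrow)
print("aligned bins:", aligned)
```

With the narrow bins, the empty bins appear as gaps even though no cell count is actually missing. With a bin width of 1 aligned to the integers, the gaps disappear.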

lmweber commented 2 years ago

Thank you!