davismcc / scaterPaperExtras

Discussion board for modifications to the scater paper

Page 3: SCESet: Paragraph 1 #7

Closed by LTLA 8 years ago

LTLA commented 8 years ago

ExpressionSet has proven a cornerstone of microarray and bulk RNA-seq analysis methods in Bioconductor, but extensions to it are desirable to add capabilities for scRNA-seq analyses.

Bit too flowery for my taste. Perhaps:

While the ExpressionSet class is the basis of many microarray and bulk RNA-seq analysis methods in Bioconductor, extensions to the class design are necessary to support scRNA-seq data analyses.


Specifically, the SCESet class adds slots for: a reduced-dimension representation of cells, cell-cell and gene-gene pairwise distance matrices, bootstrapped expression results (such as from kallisto), consensus clustering results, information about feature controls (such as ERCC spike-ins) and several more (Supplementary Figure 1).

Some explanation of why these slots are necessary would be useful:

Specifically, the SCESet class adds slots to store a reduced-dimension representation of the expression profiles, to easily visualize the relationships between cells; cell-cell and gene-gene pairwise distance matrices, for clustering or regulatory network reconstruction; bootstrapped expression results (such as from kallisto), to gauge the accuracy of expression quantification; consensus clustering results, where cluster assignments for each cell are combined from different methods to improve reliability; information about feature controls (such as ERCC spike-ins), which is required in downstream steps such as normalization, QC and detection of highly variable genes; and several more (Supplementary Figure 1).


With these extra slots, SCESet objects can support analyses of scRNA-seq data in scater and in packages that build on scater that ExpressionSet could not.

Just:

With these extra slots, SCESet objects can support analyses of scRNA-seq data that ExpressionSet cannot.


Move the last section of the following paragraph to the end of this paragraph, since the discussion of different data types relates to SCESet's storage capabilities rather than to the filtering and subsetting details. Suggested:

In addition, extra data types such as FACS marker expression or epigenetic information can be easily stored in each SCESet object for integration with the single-cell expression profiles.

davismcc commented 8 years ago

Ah, you make it hard to write with panache. But fair enough. I guess I should save that for a novel.

These changes applied.