greenelab / miQC

Flexible, probabilistic metrics for quality control of scRNA-seq data
BSD 3-Clause "New" or "Revised" License

Which data to use from Cellranger #3

Closed: tjbencomo closed 3 years ago

tjbencomo commented 3 years ago

Hi,

Very cool software! In the bioRxiv manuscript, you write:

We also caution against using miQC on data that has already been filtered by some prior preprocessing step, and recommend users of miQC be aware of any filtering that has been done on their data, especially in the case of public datasets.

Cellranger produces two output directories: an unfiltered directory (raw_feature_bc_matrix) and a filtered directory (filtered_feature_bc_matrix). I believe the filtered directory removes empty cells but does not filter for damaged or poor-quality cells.

Which directory would you recommend using as input to miQC?

arielah commented 3 years ago

Thank you for bringing this up @tjbencomo! For our Cellranger analyses, we used the data from the filtered_feature_bc_matrix directory. You're right that Cellranger's filtering step removes empty cells but not poor-quality cells, which is where miQC comes in. We'll make this clearer in the manuscript as well: "We caution against using miQC on data that has already been filtered based on mitochondrial fraction."
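
To make "filtered based on mitochondrial fraction" concrete, here is a rough sketch of the per-cell quantities involved: the percent of counts from mitochondrial genes and the number of detected genes. It is plain Python for illustration only, not miQC's code. The function name `per_cell_qc`, the toy count matrix, and the `MT-` gene-symbol prefix (a human naming convention) are all assumptions.

```python
def per_cell_qc(counts, gene_names, mito_prefix="MT-"):
    """Return (mito_percent, detected_genes) per cell.

    counts is a genes x cells matrix given as a list of rows.
    mito_prefix is an assumed naming convention for mitochondrial genes.
    """
    n_cells = len(counts[0]) if counts else 0
    # Flag genes whose symbol starts with the assumed mitochondrial prefix.
    is_mito = [name.upper().startswith(mito_prefix.upper()) for name in gene_names]

    mito_percent, detected = [], []
    for j in range(n_cells):
        column = [row[j] for row in counts]
        total = sum(column)
        mito = sum(c for c, m in zip(column, is_mito) if m)
        # Cells with zero total counts get 0.0 to avoid division by zero.
        mito_percent.append(100.0 * mito / total if total else 0.0)
        # A gene counts as detected if it has at least one count in the cell.
        detected.append(sum(1 for c in column if c > 0))
    return mito_percent, detected


# Toy data (hypothetical): 4 genes x 3 cells.
genes = ["MT-CO1", "MT-ND1", "ACTB", "GAPDH"]
counts = [
    [1, 30, 0],
    [1, 20, 0],
    [50, 25, 3],
    [48, 25, 0],
]
mito_percent, detected = per_cell_qc(counts, genes)
```

If the mitochondrial percentages in a dataset stop abruptly at a round cutoff (for example, no cell above 10%), that may be a hint that the data were already filtered on this metric, which is the case we caution against.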

tjbencomo commented 3 years ago

Thanks for the clarification!