AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

Revision: Add metadata about co-extracted vs single-extraction samples #1629

Closed sjspielman closed 1 year ago

sjspielman commented 1 year ago

To enable tumor purity thresholding using the WGS-derived tumor_fraction metadata, we should only consider samples whose RNA and DNA were co-extracted in a single experiment. We do not want to consider "single extractions" where RNA was extracted separately from DNA, because a tumor_fraction estimated from the DNA extraction by definition does not describe the separately extracted RNA.

From the Excel spreadsheet received from the sequencing centers, we want to identify samples where RNA and DNA come from the same extraction. Only these samples can then be filtered to a given tumor purity threshold for the relevant analyses.

We can either include this information in a stand-alone TSV file in the pending (?) tumor-purity-exploration module #1622, or add the co-extraction information directly to the overall pbta-histologies(-base).tsv metadata file, which would make the same filtering available across several analysis modules (CC @jaclyn-taroni for this thought!).
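To make the proposed filter concrete, here is a minimal sketch in Python using only the standard library. It assumes a hypothetical `extraction_type` column with the values `co-extracted` and `single`; that column name and its values are illustrative and not taken from the actual metadata file. The `sample_id` column name and the toy rows are likewise invented for the example; only `tumor_fraction` is named in this issue.

```python
import csv
import io

# Toy stand-in for a histologies-style TSV. All rows are invented, and the
# `extraction_type` column is a hypothetical name for the proposed metadata.
TOY_TSV = (
    "sample_id\textraction_type\ttumor_fraction\n"
    "S1\tco-extracted\t0.91\n"
    "S2\tsingle\t0.95\n"
    "S3\tco-extracted\t0.42\n"
    "S4\tco-extracted\tNA\n"
)


def samples_passing_purity(rows, threshold):
    """Return sample_ids that are co-extracted and meet the purity threshold."""
    passing = []
    for row in rows:
        # Single extractions are skipped: their WGS-derived tumor_fraction
        # does not describe the separately extracted RNA.
        if row["extraction_type"] != "co-extracted":
            continue
        value = row["tumor_fraction"]
        # Skip rows with no usable tumor_fraction estimate.
        if value in ("", "NA"):
            continue
        if float(value) >= threshold:
            passing.append(row["sample_id"])
    return passing


rows = list(csv.DictReader(io.StringIO(TOY_TSV), delimiter="\t"))
selected = samples_passing_purity(rows, threshold=0.8)
print(selected)
```

With the toy rows above, S2 is excluded despite its high tumor_fraction because it is a single extraction, which is the behavior this issue asks for.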

jaclyn-taroni commented 1 year ago

> We can either include this information in a stand-alone TSV file in the pending (?) tumor-purity-exploration module #1622, or we can directly add this co-extraction information into the overall pbta-histologies(-base).tsv metadata file which will facilitate filtering performed across several analysis modules.

I was wondering if we should track the table somewhere in this repository and then add it to pbta-histologies-base.tsv, so that it is included in pbta-histologies.tsv in the release? So we'd tack it onto subtyping, essentially.
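As a rough illustration of adding a tracked table to the base file, the sketch below joins a small co-extraction table onto base rows by a shared key. The key name `sample_id`, the `extraction_type` column, and the `broad_histology` values are assumptions made for the example, not the project's actual schema or workflow.

```python
def add_extraction_column(base_rows, extraction_rows, key="sample_id"):
    """Left-join a hypothetical extraction_type column onto base rows."""
    lookup = {row[key]: row["extraction_type"] for row in extraction_rows}
    # Samples missing from the tracked table get "NA" rather than being dropped.
    return [
        {**row, "extraction_type": lookup.get(row[key], "NA")}
        for row in base_rows
    ]


# Invented rows standing in for the base metadata and the tracked table.
base = [
    {"sample_id": "S1", "broad_histology": "A"},
    {"sample_id": "S2", "broad_histology": "B"},
    {"sample_id": "S3", "broad_histology": "A"},
]
tracked = [
    {"sample_id": "S1", "extraction_type": "co-extracted"},
    {"sample_id": "S2", "extraction_type": "single"},
]

merged = add_extraction_column(base, tracked)
for row in merged:
    print(row["sample_id"], row["extraction_type"])
```

A left join keeps every base row, so samples without extraction information stay in the file and can be handled explicitly downstream.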

jaclyn-taroni commented 1 year ago

I believe this was closed with the release of v23.