broadinstitute / ichorCNA

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.
GNU General Public License v3.0

RFC: Investigate the use of query length size selection to improve sensitivity #71

Open lbeltrame opened 4 years ago

lbeltrame commented 4 years ago

(Posting here because the forked repo doesn't have issues: @gavinha, you might want to enable them)

A couple of papers have suggested that selecting DNA fragments that match nucleosome size, or are slightly shorter than that, can greatly improve sensitivity for tumor detection in shallow WGS of cfDNA.

The selection can be done either in vitro or in silico. The former requires specialized equipment; the latter can be done by basically anyone. I've run an initial, very preliminary test on about 40-50 plasma samples from a study we are doing (at around 1X, at least for setting up conditions). What I did was keep only the aligned reads whose query length (qlen in BAM-speak) was within the nucleosome size range (90 to 180 bp). Everything else was discarded.

This reduced the number of reads by about 50%, but it had a tremendous impact on sensitivity. Before size selection, even with the suggested parameters, ichorCNA was unable to find any alteration or to determine tumor fraction (mind you, these samples have a ridiculously low amount of tumor DNA; 3% is when you are really lucky). After size selection, I was able to find a number of alterations that are real (cross-checked against the tissue of the same sample).
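
For reference, here is a minimal sketch of the kind of in silico filter described above, working on SAM text lines with only the Python standard library. It is not the exact script used for the test. It assumes that "query length" means the aligned portion of the read (CIGAR operations M, I, = and X, with soft clips excluded), and the names `aligned_query_length` and `size_select` are made up for illustration.

```python
import re
from typing import Iterable, Iterator

# CIGAR operations counted as aligned query bases (assumption: soft clips
# "S" are excluded, i.e. only the aligned part of the read is measured).
_ALIGNED_QUERY_OPS = frozenset("MI=X")
_CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_query_length(cigar: str) -> int:
    """Return the number of aligned query bases encoded in a CIGAR string."""
    if cigar == "*":  # CIGAR unavailable, e.g. unmapped read
        return 0
    return sum(
        int(count)
        for count, op in _CIGAR_RE.findall(cigar)
        if op in _ALIGNED_QUERY_OPS
    )


def size_select(
    sam_lines: Iterable[str], min_len: int = 90, max_len: int = 180
) -> Iterator[str]:
    """Yield header lines plus mapped records with min_len <= length <= max_len."""
    for line in sam_lines:
        if line.startswith("@"):  # keep the SAM header untouched
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete SAM record, skip it
            continue
        flag = int(fields[1])
        if flag & 0x4:  # 0x4 = read unmapped
            continue
        if min_len <= aligned_query_length(fields[5]) <= max_len:
            yield line
```

One way to use it would be to pipe `samtools view -h input.bam` into a small script that passes `sys.stdin` to `size_select` and writes the result to `sys.stdout`, then convert back with `samtools view -b -o selected.bam -` (the file names here are placeholders). The 90 and 180 defaults are the bounds mentioned above and would likely need tuning per dataset.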

There was no time to set up a proper benchmark, but I think it warrants some discussion on whether to build this into ichorCNA itself or to suggest it (in the wiki) for those who have samples with very low tumor fraction.

gavinha commented 4 years ago

Hi @lbeltrame

Sorry for the massively delayed follow-up on this.

Thank you for bringing up this great idea. We are currently considering incorporating something like this. Will keep you posted...

Best, Gavin