ChristofferFlensburg / superFreq

Analysis pipeline for cancer sequencing data
MIT License

New variants excel file #83

Closed UmairAhmadKhan97 closed 3 years ago

UmairAhmadKhan97 commented 3 years ago

Just a quick question here. I want to extract the new variants for my samples from the Excel files. The spreadsheet seems to have different "sections" labeled by time point, say from germline to progression. Is there any difference between to.Samp and to.Samp1, or are they all part of the same time point? I just wanted to double-check before parsing or reading anything.

ChristofferFlensburg commented 3 years ago

Hey,

That output hasn't really been updated in years, so it's probably not the place to look for things. Its purpose was to do pairwise comparisons between samples and identify variants that are present in one sample but not the other. So the tab "sample1.to.sample2" attempts to list variants present in sample 2 that did not exist (or existed only at a very low level) in sample 1. However, that output hasn't been maintained and I'm not sure it's reliable. Maybe I should actually remove it, because I think it's rare that it's the best place to look for things.
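To illustrate the idea behind such a pairwise tab, here is a minimal Python sketch. It is not superFreq's code: the function name `new_variants`, the variant keys, the allele frequencies, and both thresholds (`low`, `present`) are hypothetical and chosen only for illustration.

```python
def new_variants(vaf_sample1, vaf_sample2, low=0.02, present=0.1):
    """Return variants seen in sample 2 but absent (or very low) in sample 1.

    Both arguments map a variant identifier to its variant allele frequency.
    The thresholds are illustrative assumptions, not superFreq defaults.
    """
    return sorted(
        variant
        for variant, vaf2 in vaf_sample2.items()
        # A variant missing from sample 1 is treated as frequency 0.
        if vaf2 >= present and vaf_sample1.get(variant, 0.0) <= low
    )


# Made-up example data for two time points.
germline = {"chr1:100:A>T": 0.48, "chr2:200:G>C": 0.01}
progression = {"chr1:100:A>T": 0.51, "chr2:200:G>C": 0.35, "chr3:300:C>T": 0.22}

result = new_variants(germline, progression)
print(result)
```

With this toy data, the variant shared at high frequency by both samples is dropped, while the one that rose from a very low level and the one that is entirely new are kept.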

I'd suggest using somaticVariants.xls (which identifies somatic variants with respect to the matched normal) or the clonal tracking in the river output (which tracks variants by behaviour across all samples) instead. Looking at the scatter plots can be informative as well.

Good luck.

UmairAhmadKhan97 commented 3 years ago

I see, thanks for letting me know!

Best, Umair