bioinform / somaticseq

An ensemble approach to accurately detect somatic mutations using SomaticSeq
http://bioinform.github.io/somaticseq/
BSD 2-Clause "Simplified" License

Could you provide documentation on the 70 features extracted by somaticseq? Trying to better understand SomaticSeq's model #48

Closed kiranchari closed 6 years ago

litaifang commented 6 years ago

You can probably just look at this script: https://github.com/bioinform/somaticseq/blob/master/SSeq_merged.vcf2tsv.py. The part of the header that lists the features being used starts at line 137, and the same list is shown again at the very end of the script. Most of the header names are pretty descriptive.
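
If you would rather inspect the header of a TSV the script has already produced than read the source, something like the sketch below could work. It only assumes the output is tab-separated with the column names on the first line; the helper name `list_header_columns` and the column names in the example are placeholders for illustration, not actual SomaticSeq features.

```python
import csv
import io


def list_header_columns(handle):
    """Return the column names from the first line of a tab-separated file."""
    reader = csv.reader(handle, delimiter="\t")
    # An empty file yields an empty list instead of raising StopIteration.
    return next(reader, [])


# Hypothetical two-line TSV standing in for real output; with a real file,
# pass open("your_output.tsv", newline="") instead (path is an assumption).
example = io.StringIO("CHROM\tPOS\tfeature_a\tfeature_b\n1\t12345\t0.5\t3\n")

columns = list_header_columns(example)
for index, name in enumerate(columns, start=1):
    print(index, name)
print("total columns:", len(columns))
```

Counting the columns this way is also a quick check on how many features your version of the script emits, since not every column is necessarily used as a model feature.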

litaifang commented 6 years ago

Let me know if you need more information. The number of features is closer to 100 than 70 now.