zhouzilu / DENDRO

Genetic Heterogeneity Profiling by Single Cell RNA Sequencing
GNU General Public License v3.0

Extracting AD (read depth for each allele) in vcf_to_DENDROinput.R might not cover all mutated alleles #15

Closed: Qirong-Lin closed this issue 3 years ago

Qirong-Lin commented 3 years ago

Hi Zilu,

Excellent tool for subclone clustering! I'm trying to integrate it into my project and just checked vcf_to_DENDROinput.R. It contains the following function:

    exinfo_x <- function(info){
      return(sapply(strsplit(info,':'),function(x){ifelse(x[GT_pos]=='./.',NA,as.numeric(strsplit(x[AD_pos],',')[[1]][2]))},simplify=T))
    }

It extracts only the second entry of the AD field, i.e. the read depth of the first alternate allele listed in the .vcf file. In my case, however, there is a sample with 1/2:0,5,8:13:99:666,299,267,218,0,187.

This sample carries two different alternate alleles, and both are expressed (5 and 8 reads). With the script as written, the second alternate allele is not counted.
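To make the behavior concrete, here is a minimal reproduction on the sample field quoted above. The values of GT_pos and AD_pos are an assumption (they match a FORMAT of GT:AD:DP:GQ:PL, which this sample field appears to follow); in the script they are defined outside the function.

```r
# Assumed field positions for a FORMAT column of GT:AD:DP:GQ:PL
# (hypothetical values for this sketch; not taken from the script).
GT_pos <- 1
AD_pos <- 2

# The function quoted in this issue, reformatted only.
exinfo_x <- function(info) {
  sapply(strsplit(info, ":"), function(x) {
    ifelse(x[GT_pos] == "./.", NA,
           as.numeric(strsplit(x[AD_pos], ",")[[1]][2]))
  }, simplify = TRUE)
}

sample_field <- "1/2:0,5,8:13:99:666,299,267,218,0,187"
first_alt_only <- exinfo_x(sample_field)
print(first_alt_only)  # 5: the 8 reads of the second ALT allele are ignored
```

AD lists the reference depth first and then one depth per alternate allele, so index 2 always points at the first alternate allele, whatever the genotype is.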

Maybe we could add up the read depths of all alleles except allele 0 (the reference)?

Looking forward to discussing more!

zhouzilu commented 3 years ago

Hey Qirong,

Excellent question. Multi-ploidy is a common issue in cancer. Unfortunately, when we modeled the tumor in DENDRO, we didn't really consider multi-ploidy or multi-allele phylogeny inference. That's why we picked a single mutation (a random one of the two) in our case. However, you are more than welcome to define your own function and sum them up, which makes a lot of sense. In the future, we may work on a new version that is able to capture these corner cases too.

Best,
Zilu
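The "define your own function and sum them up" suggestion could look roughly like the sketch below. The name exinfo_x_all_alt is hypothetical and not part of DENDRO, and GT_pos and AD_pos are again assumed to match a FORMAT of GT:AD:DP:GQ:PL. It adds up every AD entry after the reference depth, and returns NA when the genotype or the AD field is missing.

```r
# Assumed field positions (hypothetical; the script defines its own).
GT_pos <- 1
AD_pos <- 2

# Hypothetical replacement for exinfo_x: total depth over all ALT alleles.
exinfo_x_all_alt <- function(info) {
  vapply(strsplit(info, ":"), function(x) {
    # Missing genotype or missing AD field: no usable count.
    if (is.na(x[GT_pos]) || x[GT_pos] %in% c("./.", ".|.") ||
        is.na(x[AD_pos]) || x[AD_pos] == ".") {
      return(NA_real_)
    }
    ad <- suppressWarnings(as.numeric(strsplit(x[AD_pos], ",")[[1]]))
    # ad[1] is the reference depth; every later entry is an ALT allele.
    sum(ad[-1])
  }, numeric(1))
}

alt_counts <- exinfo_x_all_alt(c(
  "1/2:0,5,8:13:99:666,299,267,218,0,187",  # two ALT alleles: 5 + 8
  "0/1:10,4:14:99:120,0,300",               # one ALT allele
  "./.:0,0:0:.:."                           # missing genotype
))
print(alt_counts)  # 13 4 NA
```

Summing treats all alternate alleles at a site as a single "mutated" class. That is a modeling choice rather than something DENDRO itself prescribes, so it is worth checking that the total-depth extraction used alongside it counts the same alleles.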

Qirong-Lin commented 3 years ago

Sounds great! Thanks!

Qirong