greenelab / pdx_exomeseq

Pipeline analysis for whole exome sequencing of pancreatic cancer PDX models
MIT License

Call Variants in "Ambiguous" assigned reads #26

Closed by gwaybio 6 years ago

gwaybio commented 6 years ago

It is possible that mutations in regions highly conserved between human and mouse are not included in our final VCF file. Reads carrying these variants are likely to be assigned to the "ambiguous" category by ngs_disambiguate.

As @huqiwen0313 noted in #24:

The reason I think these regions would be interesting is that variants appearing in conserved regions are more likely to be functionally relevant. I did not find a database that lists all of the cancer-associated variants in conserved regions, but there are some related papers (e.g. https://www.nature.com/articles/srep22124, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3919555/).

It would be good to run all downstream steps on the ambiguous read sets, as was done for the human reads.

gwaybio commented 6 years ago

I will close this issue for now, but we can revisit it if it is deemed necessary in the future. Currently, only a small proportion of reads are assigned as ambiguous.
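
For anyone revisiting this, a minimal sketch of how the ambiguous proportion could be checked per sample. The function name and the counts below are hypothetical placeholders for illustration, not values from this project; the real per-category read counts would need to come from the disambiguation output for each sample.

```python
def ambiguous_fraction(human: int, mouse: int, ambiguous: int) -> float:
    """Return the share of reads that fell into the ambiguous category."""
    total = human + mouse + ambiguous
    if total == 0:
        raise ValueError("no reads counted")
    return ambiguous / total


# Placeholder counts for illustration only; not results from this project.
fraction = ambiguous_fraction(human=900, mouse=80, ambiguous=20)
print(f"{fraction:.1%}")  # prints "2.0%"
```

If this fraction turned out to be large for some samples, that would be one argument for reopening the issue and running the downstream steps on the ambiguous set.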