Mutations in regions that are highly conserved between human and mouse may be missing from our final VCF file. Reads carrying these variants are likely to be assigned to "ambiguous" by ngs_disambiguate.
As @huqiwen0313 noted in #24:
The reason I think these regions would be interesting is that variants appearing in conserved regions are more likely to be functionally relevant. I did not find a database listing all of the cancer-associated variants in conserved regions, but there are some related papers (e.g. https://www.nature.com/articles/srep22124, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3919555/).
It would be good to run all downstream steps on the ambiguous sets, as was done for human.
I will close this issue for now, but we can revisit it if that becomes necessary. Currently, only a small proportion of reads were mapped as ambiguous.