Bioconductor / VariantAnnotation

Annotation of Genetic Variants
https://bioconductor.org/packages/VariantAnnotation
Create protein sequences including variants from a VCF file #24

Open baderd opened 5 years ago

baderd commented 5 years ago

Dear BiocTeam,

I am investigating the proteome of human cancer samples and want to insert their genetic variants into the reference proteome FASTA sequences to increase the sensitivity of my peptide/protein quantification.

Could you implement such a function, e.g. "proteomeVariantInsertion()", in the VariantAnnotation package?

The VariantAnnotation::predictCoding() function already translates codons at variant positions from a reference BSgenome object to assess the consequences of a variant. I would like to take all coding variants (or just non-synonymous SNVs for a start), insert them into the reference proteome, and save the modified FASTA file.
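To make the request concrete, here is a minimal sketch in plain Python of what such an insertion could look like for non-synonymous SNVs. It is not an existing VariantAnnotation or Biostrings API: the function names `insert_protein_variants()` and `to_fasta()`, the variant tuple layout, and the protein ID are all hypothetical, and it assumes the amino acid changes have already been computed (for example from predictCoding() output) as 1-based protein positions.

```python
def insert_protein_variants(proteome, variants):
    """Return a copy of `proteome` with amino acid substitutions applied.

    proteome: dict mapping protein ID -> amino acid sequence (str)
    variants: iterable of (protein_id, position, ref_aa, alt_aa),
              with 1-based positions (hypothetical layout, for illustration)
    """
    mutated = {pid: list(seq) for pid, seq in proteome.items()}
    for pid, pos, ref_aa, alt_aa in variants:
        residues = mutated[pid]
        # Guard against mismatched annotation: the reference residue
        # must agree with the sequence we are about to modify.
        if residues[pos - 1] != ref_aa:
            raise ValueError(
                f"{pid}: expected {ref_aa} at position {pos}, "
                f"found {residues[pos - 1]}"
            )
        residues[pos - 1] = alt_aa
    return {pid: "".join(residues) for pid, residues in mutated.items()}


def to_fasta(proteome, width=60):
    """Format a dict of sequences as FASTA text, wrapped at `width` columns."""
    lines = []
    for pid, seq in proteome.items():
        lines.append(f">{pid}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy example: a single threonine-to-serine substitution at position 3.
reference = {"P1": "MKTAYIAK"}
sample = insert_protein_variants(reference, [("P1", 3, "T", "S")])
fasta_text = to_fasta(sample)
```

The sketch covers substitutions only. Frameshifts, indels, stop gains, and several variants hitting the same codon would need to be handled at the DNA level before translation.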

See also my post in the Bioconductor forum.

Thanks, Daniel

lawremi commented 5 years ago

Do you want to modify the DNA sequences or the protein sequences? I think BSgenome::injectSNPs() should get you pretty close to the former, and it wouldn't be too hard to adapt to the latter. Either way, I'm not sure this belongs in VariantAnnotation.
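The DNA-level route can be illustrated with a small plain-Python sketch: substitute the SNVs into a coding sequence, then translate the whole sequence. This is not how BSgenome::injectSNPs() is implemented or called. The helper names `inject_snvs()` and `translate()` are hypothetical, and the sketch assumes an already spliced, uppercase CDS in coding orientation with 1-based positions relative to that CDS and the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def inject_snvs(cds, snvs):
    """Apply single-nucleotide substitutions to a coding sequence.

    snvs: iterable of (position, ref_base, alt_base), 1-based within the CDS
          (hypothetical layout, for illustration)
    """
    bases = list(cds)
    for pos, ref_base, alt_base in snvs:
        if bases[pos - 1] != ref_base:
            raise ValueError(
                f"expected {ref_base} at CDS position {pos}, "
                f"found {bases[pos - 1]}"
            )
        bases[pos - 1] = alt_base
    return "".join(bases)


def translate(cds):
    """Translate a CDS codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy example: A>T at CDS position 7 changes codon ACC (Thr) to TCC (Ser).
reference_cds = "ATGAAAACCGCTTAA"
variant_cds = inject_snvs(reference_cds, [(7, "A", "T")])
reference_protein = translate(reference_cds)
variant_protein = translate(variant_cds)
```

Translating the modified CDS as a whole yields the full variant isoform rather than a single changed codon, and it handles multiple SNVs in the same codon without extra logic.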

baderd commented 5 years ago

I agree that my request sounds odd for VariantAnnotation and would fit better in Biostrings or a similar sequence-manipulation package. The reason I linked my original question to this package is that predictCoding() already does everything I want, except that it translates only single positions rather than the protein isoform as a whole.

In the meantime, I have suggested a possible workflow in the Bioconductor forum.

karlmakepeace commented 2 years ago

"proteomeVariantInsertion()" would be very useful.