Hello dear developers of py-pgatk,
I am trying to get protein sequences from a TCGA MAF file. I converted the MAF file to VCF with maf2vcf.pl and then ran vcf-to-proteindb on the result. For this I also downloaded the GDC reference files: the GRCh38.d1.vd1 Reference Sequence and the GDC.h38 GENCODE v36 GTF. The output contains incorrect amino acid sequences for some mutations.
This is the command I ran:
python ../py-pgatk/pypgatk/pypgatk_cli.py vcf-to-proteindb --vcf TCGA-06-A5U1-01A-11D-A33T-08.vcf --input_fasta input_fasta.fa --gene_annotations_gtf gencode.v36.annotation.gtf --annotation_field_name '' --output_proteindb var_peptides.fa
How can I fix this problem?
Thanks.