griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

CannotConvertValue: 125,142 cannot be converted to Integer, keeping as string #1120

Closed dincerkilic closed 2 months ago

dincerkilic commented 3 months ago

Installation Type

Standalone

pVACtools Version / Docker Image

4.2.1

Python Version

Python 3.11.6

Operating System

No response

Describe the bug

Hello, I use the command below to annotate my VCF file. No problem occurs at this step.

../ensembl-vep/vep -i BSWC_NMO_P61.final.vcf -o out_vep/out_BSWC_NMO_P61.final.vcf --format vcf --vcf --symbol --terms SO --tsl --biotype --hgvs --offline --cache --dir_cache /home/ubuntu/.vep/ --plugin Frameshift --plugin Wildtype --dir_plugins /home/ubuntu/.vep/Plugins/ --species mus_musculus --fasta ../../İndirilenler/mm10.fa

However, I get the output below when I run pvacseq run with the following command:

pvacseq run out_vep/out_BSWC_NMO_P61.final.vcf BSWC_NMO_P61 H-2-Db,H-2-Dd,H-2-Dk,H-2-Dp,H-2-Kb,H-2-Kd,H-2-Kk,H-2-Kq,H-2-Ld,H-2-Lq,H2-IAb,H2-IAd,H2-IEd BigMHC_EL BigMHC_IM DeepImmuno NNalign NetMHC NetMHCIIpan NetMHCIIpanEL NetMHCcons NetMHCpan NetMHCpanEL SMM SMMPMBEC SMMalign out_pvac -t 6

Executing MHC Class I predictions Converting .vcf to TSV /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 125,137 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000170202 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000170895 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000172714 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000240589 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 65976308 GT G ENSMUST00000080665 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73278973 C T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73278994 C A ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279077 GCCCGAC G ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279086 TGGCATAAAGTGCA T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279719 C T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. 
These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279759 T C ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279796 C T ENSMUST00000134011 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000001548 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000107739 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000120375 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 127,142 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000119866 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000120956 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000120984 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000183698 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000184703 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 137,127 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000023491 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. 
chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000135193 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000165616 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000170201 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000170899 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 24936490 ACACCGTCCCT A ENSMUST00000234399 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000004985 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000156029 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000232705 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000232997 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233561 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233646 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233701 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233709 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr18 60845879 G GGA ENSMUST00000235795 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr19 41878204 A AC ENSMUST00000038677 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. 
chr1 59206622 TAGAC T ENSMUST00000027178 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 59206622 TAGAC T ENSMUST00000163058 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 121483419 T TAA ENSMUST00000001724 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr2 86355798 T C ENSMUST00000213225 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 125,142 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr3 123121294 A AG ENSMUST00000047923 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr3 123121295 A AAG ENSMUST00000047923 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000071829 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000120591 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000135248 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000200390 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000202283 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr5 120959415 T C ENSMUST00000183291 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 121,137 cannot be converted to Integer, keeping as string. 
warnings.warn( /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 142,105 cannot be converted to Integer, keeping as string. warnings.warn( WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382659 T G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382773 A G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382810 T G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382825 A G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382867 C T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382882 A T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382888 T C ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382948 C T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. 
chr7 104383304 G T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383352 TAA T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383400 T C ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383433 T C ENSMUST00000072887 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 137,123 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 86440558 ATT A ENSMUST00000070064 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 86440558 ATT A ENSMUST00000072585 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 106430117 CT C ENSMUST00010126032 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 106430117 CT C ENSMUST00010126065 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060425 G A ENSMUST00000173185 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060425 G A ENSMUST00000248980 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060472 CG C ENSMUST00000173185 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. 
These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060472 CG C ENSMUST00000248980 Completed Converting VCF to TSV frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000170202 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000170895 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000172714 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000240589 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 65976308 GT G ENSMUST00000080665 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73278973 C T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73278994 C A ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279077 GCCCGAC G ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279086 TGGCATAAAGTGCA T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279719 C T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. 
chr11 73279759 T C ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279796 C T ENSMUST00000134011 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000001548 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000107739 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000120375 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000119866 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000120956 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000120984 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000183698 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000184703 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000023491 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000135193 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000165616 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000170201 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. 
chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000170899 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 24936490 ACACCGTCCCT A ENSMUST00000234399 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000004985 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000156029 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000232705 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000232997 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233561 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233646 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233701 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233709 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr18 60845879 G GGA ENSMUST00000235795 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr19 41878204 A AC ENSMUST00000038677 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 59206622 TAGAC T ENSMUST00000027178 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 59206622 TAGAC T ENSMUST00000163058 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 121483419 T TAA ENSMUST00000001724 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. 
chr2 86355798 T C ENSMUST00000213225 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr3 123121294 A AG ENSMUST00000047923 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr3 123121295 A AAG ENSMUST00000047923 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000071829 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000120591 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000135248 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000200390 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000202283 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr5 120959415 T C ENSMUST00000183291 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382659 T G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382773 A G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382810 T G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382825 A G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. 
These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382867 C T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382882 A T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382888 T C ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382948 C T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383304 G T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383352 TAA T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383400 T C ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383433 T C ENSMUST00000072887 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 86440558 ATT A ENSMUST00000070064 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 86440558 ATT A ENSMUST00000072585 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 106430117 CT C ENSMUST00010126032 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. 
chr9 106430117 CT C ENSMUST00010126065 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060425 G A ENSMUST00000173185 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060425 G A ENSMUST00000248980 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060472 CG C ENSMUST00000173185 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060472 CG C ENSMUST00000248980 Completed Generating Variant Peptide FASTA and Key File ERROR: There was a mismatch between the actual wildtype amino acid sequence (H) and the expected amino acid sequence (D). Did you use the same reference build version for VEP that you used for creating the VCF? 
{'chromosome_name': 'chr10', 'start': '4403744', 'stop': '4403745', 'reference': 'G', 'variant': 'A', 'gene_name': 'Armt1', 'transcript_name': 'ENSMUST00000095893', 'transcript_support_level': 'Not Supported', 'transcript_length': '439', 'biotype': 'protein_coding', 'amino_acid_change': 'D/N', 'codon_change': 'Gat/Aat', 'ensembl_gene_id': 'ENSMUSG00000061759', 'hgvsc': 'ENSMUST00000095893.11:c.829G>A', 'hgvsp': 'ENSMUSP00000093581.5:p.Asp277Asn', 'wildtype_amino_acid_sequence': 'MAESPAFLSAKDEGSFAYLTIKDRTPQILTKVIDTLHRHKSEFFEKHGEEGIEAEKKAISLLSKLRNELQTDKPITPLVDKCVDTHIWNQYLEYQRSLLNEGDGEPRWFFSPWLFVECYMYRRIHEAIMQSPPIHDFDVFKESKEENFFESQGSIDALCSHLLQLKPVKGLREEQIQDEFFKLLQISLWGNKCDLSLSGGESSSQKANIINCLQDLKPFILINDTESLWALLSKLKKTVETPVVRVDIVLDNSGFELITDLVLADFLFSSELATEIHFHGKSIPWFVSDVTEHDFNWIVEHMKSSNLESMSTCGACWEAYARMGRWAYHDHAFWTLPHPYCVMPQVAPDLYAELQKAHLILFKGDLNYRKLMGDRKWKFTFPFHQALSGFHPAPLCSIRTLKCELQVGLQPGQAEQLTASDPHWLTTGRYGILQFDGPL', 'frameshift_amino_acid_sequence': '', 'fusion_amino_acid_sequence': '', 'variant_type': 'missense', 'protein_position': '277', 'transcript_expression': 'NA', 'gene_expression': 'NA', 'normal_depth': 'NA', 'normal_vaf': 'NA', 'tdna_depth': '12', 'tdna_vaf': '1.0', 'trna_depth': 'NA', 'trna_vaf': 'NA', 'index': '2.Armt1.ENSMUST00000095893.missense.277D/N', 'protein_length_change': '', 'fusion_read_support': 'NA', 'fusion_expression': 'NA'}
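For readers trying to follow the ERROR above, here is a rough sketch of the kind of check it describes. It assumes the check compares the residue at protein_position in wildtype_amino_acid_sequence with the reference side of amino_acid_change (D in D/N). This is not pVACtools code: the function name and the toy sequence are made up for illustration.

```python
def wildtype_matches(protein_sequence: str, protein_position: int, expected_aa: str) -> bool:
    """Return True if the residue at the 1-based protein_position equals expected_aa."""
    index = protein_position - 1  # VEP protein positions are 1-based
    if index < 0 or index >= len(protein_sequence):
        return False
    return protein_sequence[index] == expected_aa


# Toy data (hypothetical, not taken from the log): residue 6 is H.
toy_sequence = "MAESPH"

# The annotation claims the wildtype residue is D, but the sequence has H there.
print(wildtype_matches(toy_sequence, 6, "D"))
# The same position checked against the residue that is actually present.
print(wildtype_matches(toy_sequence, 6, "H"))
```

If a check like this fails for many transcripts, that would fit the hint in the error message that the reference build used for VEP may differ from the one used to create the VCF.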

Then I ran:

ref-transcript-mismatch-reporter out_vep_BSWC_NMO_P61/out_BSWC_NMO_P61.final.vcf --filter soft

The output was:

/home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 125,137 cannot be converted to Integer, keeping as string.
  warnings.warn(
/home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 127,142 cannot be converted to Integer, keeping as string.
  warnings.warn(
/home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 137,127 cannot be converted to Integer, keeping as string.
  warnings.warn(
/home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 125,142 cannot be converted to Integer, keeping as string.
  warnings.warn(
/home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 121,137 cannot be converted to Integer, keeping as string.
  warnings.warn(
/home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 142,105 cannot be converted to Integer, keeping as string.
  warnings.warn(
/home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 137,123 cannot be converted to Integer, keeping as string.
  warnings.warn(
INFO:root:
Total number of variants: 151975
Total number of processable variants (at least one missense, inframe indels, or frameshift transcript): 3476
Total number of variants with mismatched annotations: 2257
Percentage of processable variants with mismatched annotations: 64.93%
Percentage of variants with mismatched annotations: 1.49%
Total number of transcripts: 531382
Total number of processable transcripts (missense, inframe indels, frameshifts): 7808
Total number of transcripts with mismatched annotations: 5224
Percentage of processable transcripts with mismatched annotations: 66.91%
Percentage of all transcripts with mismatched annotations: 0.98%
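As a quick sanity check on the reporter summary, the percentages can be recomputed from the raw counts. This is only arithmetic on the numbers printed above. The helper name and dictionary keys are made up for illustration and are not part of ref-transcript-mismatch-reporter.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of part in whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Counts copied from the reporter output above.
counts = {
    "variants_total": 151975,
    "variants_processable": 3476,
    "variants_mismatched": 2257,
    "transcripts_total": 531382,
    "transcripts_processable": 7808,
    "transcripts_mismatched": 5224,
}

summary = {
    "processable variants mismatched": pct(counts["variants_mismatched"], counts["variants_processable"]),
    "all variants mismatched": pct(counts["variants_mismatched"], counts["variants_total"]),
    "processable transcripts mismatched": pct(counts["transcripts_mismatched"], counts["transcripts_processable"]),
    "all transcripts mismatched": pct(counts["transcripts_mismatched"], counts["transcripts_total"]),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

The recomputed values agree with the reported ones. Roughly two thirds of the processable variants and transcripts are flagged as mismatched, which is a far larger share than a handful of odd transcripts would produce.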

I then ran pvacseq run again with the out_BSWC_NMO_P61.final.filtered.vcf file and got an error:

Executing MHC Class I predictions Converting .vcf to TSV /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 125%2C137 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000170202 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000170895 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000172714 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000240589 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 65976308 GT G ENSMUST00000080665 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73278973 C T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73278994 C A ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279077 GCCCGAC G ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279086 TGGCATAAAGTGCA T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279719 C T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. 
These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279759 T C ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279796 C T ENSMUST00000134011 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000001548 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000107739 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000120375 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 127%2C142 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000119866 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000120956 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000120984 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000183698 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000184703 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 137%2C127 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000023491 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. 
chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000135193 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000165616 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000170201 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000170899 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 24936490 ACACCGTCCCT A ENSMUST00000234399 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000004985 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000156029 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000232705 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000232997 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233561 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233646 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233701 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233709 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr18 60845879 G GGA ENSMUST00000235795 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr19 41878204 A AC ENSMUST00000038677 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. 
chr1 59206622 TAGAC T ENSMUST00000027178 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 59206622 TAGAC T ENSMUST00000163058 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 121483419 T TAA ENSMUST00000001724 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr2 86355798 T C ENSMUST00000213225 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 125%2C142 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr3 123121294 A AG ENSMUST00000047923 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr3 123121295 A AAG ENSMUST00000047923 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000071829 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000120591 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000135248 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000200390 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000202283 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr5 120959415 T C ENSMUST00000183291 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 121%2C137 cannot be converted to Integer, keeping as string. 
warnings.warn( /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 142%2C105 cannot be converted to Integer, keeping as string. warnings.warn( WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382659 T G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382773 A G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382810 T G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382825 A G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382867 C T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382882 A T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382888 T C ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382948 C T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. 
Skipping. chr7 104383304 G T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383352 TAA T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383400 T C ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383433 T C ENSMUST00000072887 /home/ubuntu/environments/my_env/lib/python3.11/site-packages/vcfpy/parser.py:251: CannotConvertValue: 137%2C123 cannot be converted to Integer, keeping as string. warnings.warn( frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 86440558 ATT A ENSMUST00000070064 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 86440558 ATT A ENSMUST00000072585 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 106430117 CT C ENSMUST00010126032 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 106430117 CT C ENSMUST00010126065 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060425 G A ENSMUST00000173185 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060425 G A ENSMUST00000248980 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. 
chr9 119060472 CG C ENSMUST00000173185 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060472 CG C ENSMUST00000248980 Completed Converting VCF to TSV frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000170202 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000170895 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000172714 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 59076020 TG T ENSMUST00000240589 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 65976308 GT G ENSMUST00000080665 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73278973 C T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73278994 C A ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279077 GCCCGAC G ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279086 TGGCATAAAGTGCA T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279719 C T ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. 
These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279759 T C ENSMUST00000134011 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr11 73279796 C T ENSMUST00000134011 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000001548 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000107739 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr11 94946682 G GC ENSMUST00000120375 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000119866 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000120956 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000120984 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000183698 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr14 21783914 GTTGTT G ENSMUST00000184703 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000023491 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000135193 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000165616 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000170201 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. 
chr16 32826033 A ACAGCTGGGACATAGGCCTTTAAGAGACAACAGGG ENSMUST00000170899 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 24936490 ACACCGTCCCT A ENSMUST00000234399 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000004985 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000156029 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000232705 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000232997 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233561 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233646 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233701 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr17 29036165 CT C ENSMUST00000233709 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr18 60845879 G GGA ENSMUST00000235795 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr19 41878204 A AC ENSMUST00000038677 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 59206622 TAGAC T ENSMUST00000027178 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 59206622 TAGAC T ENSMUST00000163058 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr1 121483419 T TAA ENSMUST00000001724 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. 
chr2 86355798 T C ENSMUST00000213225 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr3 123121294 A AG ENSMUST00000047923 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr3 123121295 A AAG ENSMUST00000047923 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000071829 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000120591 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000135248 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000200390 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr5 32897969 CAT C ENSMUST00000202283 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr5 120959415 T C ENSMUST00000183291 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382659 T G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382773 A G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382810 T G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382825 A G ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. 
These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382867 C T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382882 A T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382888 T C ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104382948 C T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383304 G T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383352 TAA T ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383400 T C ENSMUST00000072887 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr7 104383433 T C ENSMUST00000072887 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 86440558 ATT A ENSMUST00000070064 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 86440558 ATT A ENSMUST00000072585 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. chr9 106430117 CT C ENSMUST00010126032 frameshift_variant transcript does not contain a FrameshiftSequence. Skipping. 
chr9 106430117 CT C ENSMUST00010126065 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060425 G A ENSMUST00000173185 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060425 G A ENSMUST00000248980 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060472 CG C ENSMUST00000173185 WARNING:root:Transcript WildtypeProtein sequence contains internal stop codon. These can occur in Ensembl transcripts of the biotype polymorphic_pseudogene. Skipping. chr9 119060472 CG C ENSMUST00000248980 Completed Generating Variant Peptide FASTA and Key File ERROR: There was a mismatch between the actual wildtype amino acid sequence (H) and the expected amino acid sequence (D). Did you use the same reference build version for VEP that you used for creating the VCF? 
{'chromosome_name': 'chr10', 'start': '4403744', 'stop': '4403745', 'reference': 'G', 'variant': 'A', 'gene_name': 'Armt1', 'transcript_name': 'ENSMUST00000095893', 'transcript_support_level': 'Not Supported', 'transcript_length': '439', 'biotype': 'protein_coding', 'amino_acid_change': 'D/N', 'codon_change': 'Gat/Aat', 'ensembl_gene_id': 'ENSMUSG00000061759', 'hgvsc': 'ENSMUST00000095893.11:c.829G>A', 'hgvsp': 'ENSMUSP00000093581.5:p.Asp277Asn', 'wildtype_amino_acid_sequence': 'MAESPAFLSAKDEGSFAYLTIKDRTPQILTKVIDTLHRHKSEFFEKHGEEGIEAEKKAISLLSKLRNELQTDKPITPLVDKCVDTHIWNQYLEYQRSLLNEGDGEPRWFFSPWLFVECYMYRRIHEAIMQSPPIHDFDVFKESKEENFFESQGSIDALCSHLLQLKPVKGLREEQIQDEFFKLLQISLWGNKCDLSLSGGESSSQKANIINCLQDLKPFILINDTESLWALLSKLKKTVETPVVRVDIVLDNSGFELITDLVLADFLFSSELATEIHFHGKSIPWFVSDVTEHDFNWIVEHMKSSNLESMSTCGACWEAYARMGRWAYHDHAFWTLPHPYCVMPQVAPDLYAELQKAHLILFKGDLNYRKLMGDRKWKFTFPFHQALSGFHPAPLCSIRTLKCELQVGLQPGQAEQLTASDPHWLTTGRYGILQFDGPL', 'frameshift_amino_acid_sequence': '', 'fusion_amino_acid_sequence': '', 'variant_type': 'missense', 'protein_position': '277', 'transcript_expression': 'NA', 'gene_expression': 'NA', 'normal_depth': 'NA', 'normal_vaf': 'NA', 'tdna_depth': '12', 'tdna_vaf': '1.0', 'trna_depth': 'NA', 'trna_vaf': 'NA', 'index': '2.Armt1.ENSMUST00000095893.missense.277D/N', 'protein_length_change': '', 'fusion_read_support': 'NA', 'fusion_expression': 'NA'}

Do you have an idea how to fix the problem? Thank you in advance!
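
A note on the vcfpy `CannotConvertValue` warnings in the output above: they appear to be non-fatal, since the message itself says the value is kept as a string. `%2C` is the percent-encoding of a comma, so `125%2C142` is the pair `125,142` in a field that vcfpy is trying to read as a single Integer. The sketch below only illustrates why the conversion fails; the helper `parse_integer_field` is hypothetical and is not part of vcfpy or pVACtools.

```python
from urllib.parse import unquote


def parse_integer_field(raw):
    """Decode a percent-encoded VCF value and parse it as a list of integers.

    Hypothetical helper for illustration only.
    """
    decoded = unquote(raw)  # "125%2C142" becomes "125,142"
    return [int(part) for part in decoded.split(",")]


raw_value = "125%2C142"

# Parsing the decoded value as ONE integer fails, which is what the warning reports.
try:
    int(unquote(raw_value))
    single_int_ok = True
except ValueError:
    single_int_ok = False

# Parsing it as a comma-separated list of integers works.
values = parse_integer_field(raw_value)
```

If the field really carries two values per record, the likely fix is on the header side (the declared Number/Type of that field) rather than in pVACseq, but that is an assumption to verify against the VCF header.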

How to reproduce this bug

In description.

Input files

No response

Log output

Not available

Output files

No response

susannasiebert commented 3 months ago

Did you use the --pass-only flag when running pVACseq the second time with your soft-filtered VCF? Otherwise pVACseq will still try to process all entries, regardless of filter status.
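
To make the distinction concrete, here is a minimal Python sketch of what restricting a soft-filtered VCF to passing entries means. It is not pVACseq's implementation: the function name `passes_filter` and the three records are made up for illustration, and treating an empty FILTER (`.`) as passing is an assumption.

```python
def passes_filter(vcf_line):
    """Return True if a VCF data line has FILTER set to PASS or '.'.

    Hypothetical helper; treating '.' (no filter applied) as passing
    is an assumption, not confirmed pVACseq behaviour.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    filter_value = fields[6]  # FILTER is the 7th tab-separated VCF column
    return filter_value in ("PASS", ".")


# Made-up records: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
records = [
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT\t10\tLowQual\t.",
    "chr2\t300\t.\tG\tA\t60\t.\t.",
]

# A soft-filtered VCF keeps all three lines; only two of them pass.
kept = [line for line in records if passes_filter(line)]
```

In a soft-filtered VCF the failing record is still present in the file, which is why a tool that does not check FILTER will process it.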

Either way I would be concerned about the unusually high mismatch rate that the ref-transcript-mismatch-reporter reported (Percentage of processable transcripts with mismatched annotations: 66.91%). That means 66.91% of the transcripts for variant types that pVACseq supports are mismatched, which may indicate that different references were used at some point in your pipeline. I would carefully check your pipeline to ensure that the same Ensembl version, reference build version, and reference FASTA are used throughout. If all of those are consistent across variant calling and annotation, these mismatches should be rare to non-existent. A reference mismatch could also explain the high number of "frameshift_variant transcript does not contain a FrameshiftSequence" warnings you are seeing. Once you have confirmed that your reference data is consistent throughout your pipeline, moving forward with the filtered VCF would indeed be the right approach.
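
The kind of check behind the "mismatch between the actual wildtype amino acid sequence (H) and the expected amino acid sequence (D)" error can be sketched as follows. This is an illustration under stated assumptions, not pVACseq's code: the function `wildtype_matches` and the five-residue toy sequences are hypothetical, and the inputs mirror the `protein_position` and `amino_acid_change` fields shown in the error record.

```python
def wildtype_matches(protein_sequence, protein_position, amino_acid_change):
    """Compare the residue at a 1-based protein position with the expected
    wildtype residue taken from an 'X/Y' amino acid change string.

    Hypothetical helper; returns (matches, actual, expected).
    """
    if not 1 <= protein_position <= len(protein_sequence):
        raise ValueError("protein_position is outside the sequence")
    expected = amino_acid_change.split("/")[0]  # "D/N" gives "D"
    actual = protein_sequence[protein_position - 1]
    return actual == expected, actual, expected


# Toy sequences (made up): annotation expects D at position 3.
mismatch = wildtype_matches("MAHSP", 3, "D/N")    # sequence has H there
consistent = wildtype_matches("MADSP", 3, "D/N")  # sequence has D there
```

When the transcript sequence and the variant annotation come from different reference or Ensembl versions, the residue at the annotated position can differ from the one the annotation expects, which is the situation the first call reproduces.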

dincerkilic commented 3 months ago

Thank you very much for your response. I will correct them and try it again!