Closed Dongho1234 closed 3 years ago
What do you mean by "annotation output with nucleotide sequences"?
We have GFF3 output (see https://github.com/ncbi/pgap/wiki/Output-Files)
What I mean is this: the current output file (faa) contains protein sequences, but I would like an output of the protein products annotated on the genome, in FASTA format, with nucleotide sequences instead of amino acid sequences.
Ex)
gnl|extdb|pgaptmp_001860 D-alanyl-D-alanine carboxypeptidase/D-alanyl-D-alanine-endopeptidase [Bacillus subtilis] AAAATTGGGCCCCCC~~~~
The GFF3 file contains the locations of the proteins on the nucleotide sequence. Those coordinates can be used to extract the corresponding nucleotide substrings.
As Azat wrote, your best bet is to get the coordinates from the GFF and then extract the corresponding nucleotide sequences from the genomic fasta. We will consider your request for a nucleotide file of annotated features in future versions of PGAP though.
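For anyone who wants to try the coordinate-based approach described above, here is a minimal sketch in plain Python (standard library only). It is not part of PGAP. It assumes that the first column of the GFF3 matches the sequence IDs in the genomic FASTA, that each CDS sits on a single GFF3 line (multi-segment CDS features are not joined), and that the CDS rows carry `ID` and `product` attributes. The function names are made up for illustration.

```python
# Complement table for reverse-complementing minus-strand features.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(handle):
    """Return {sequence_id: sequence} from an open FASTA file handle."""
    seqs, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the ID (text up to the first whitespace).
            name, chunks = line[1:].split()[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def parse_attributes(field):
    """Split GFF3 column 9 ("key=value;key=value") into a dict."""
    return dict(item.split("=", 1) for item in field.split(";") if "=" in item)


def cds_nucleotides(gff_handle, genome):
    """Yield (feature_id, product, nucleotide_sequence) for each CDS row."""
    for line in gff_handle:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section, if present, ends the features
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        attrs = parse_attributes(cols[8])
        # GFF3 coordinates are 1-based and inclusive.
        seq = genome[seqid][start - 1:end]
        if strand == "-":
            seq = seq.translate(_COMPLEMENT)[::-1]
        yield attrs.get("ID", f"{seqid}:{start}-{end}"), attrs.get("product", ""), seq


def write_cds_fasta(gff_path, genome_path, out_path):
    """Write one nucleotide FASTA record per CDS row of the GFF3 file."""
    with open(genome_path) as fh:
        genome = read_fasta(fh)
    with open(gff_path) as gff, open(out_path, "w") as out:
        for feature_id, product, seq in cds_nucleotides(gff, genome):
            out.write(f">{feature_id} {product}\n{seq}\n")
```

Calling `write_cds_fasta` with the paths to the GFF3 and genomic FASTA files from the PGAP output directory would write one record per CDS. Attribute values in GFF3 may be percent-encoded, so product names containing characters such as `,` or `;` may need `urllib.parse.unquote` before writing.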
Hi, I have annotation output (faa file) with protein sequences. Is there any way I can get annotation output with nucleotide sequences?