ogotoh / spaln

Genome mapping and spliced alignment of cDNA or amino acid sequences
GNU General Public License v2.0

Format of target name #60

Closed dariober closed 1 year ago

dariober commented 1 year ago

When aligning and mapping a query to a reference, spaln appears to use the string after | in the query name as the Target name. For example:

>KAF4646005.1|foo
MGCTGSKAAAVKKPDSPEDKREANDKPQLSTGHAEALPGVVAAGGAQDSSDAKSAASLTT
...

After alignment, the Target key has value foo (i.e. Target=foo):

spaln -O:0 -Q7 -d ToxoDB-56_TgondiiRH88_Genome KAF4646005.faa
##gff-version   3
##sequence-region   CM023082 1686529 1695744
CM023082    ALN gene    1687335 1690489 2025    -   .   ID=gene00001;Name=CM023082_1688
CM023082    ALN mRNA    1687335 1690489 2025    -   .   ID=mRNA00001;Parent=gene00001;Name=CM023082_1688
CM023082    ALN cds 1689930 1690489 1078    -   0   ID=cds00001;Parent=mRNA00001;Name=CM023082_1688;Target=foo 1 187 +
CM023082    ALN cds 1689053 1689225 365 -   1   ID=cds00002;Parent=mRNA00001;Name=CM023082_1688;Target=foo 188 244 +
CM023082    ALN cds 1688502 1688604 242 -   2   ID=cds00003;Parent=mRNA00001;Name=CM023082_1688;Target=foo 245 279 +
CM023082    ALN cds 1687839 1687987 298 -   1   ID=cds00004;Parent=mRNA00001;Name=CM023082_1688;Target=foo 280 328 +
CM023082    ALN cds 1687335 1687450 160 -   2   ID=cds00005;Parent=mRNA00001;Name=CM023082_1688;Target=foo 329 366 +

I would have expected Target to be the full name KAF4646005.1|foo. Is this behaviour documented? Can it be changed? I would prefer the Target to be the sequence name up to the first whitespace.

ogotoh commented 1 year ago

Dear Dario,

According to your request, I have created a new option, -pF, that shows the full entry name rather than the last term separated by vertical bars.

Osamu
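
For readers comparing the two naming rules discussed in this thread, here is a minimal Python sketch. It is not spaln's code: the function names last_bar_term and full_entry_name are hypothetical, and the sketch assumes the entry name is the first whitespace-delimited token of the FASTA header line.

```python
def full_entry_name(header: str) -> str:
    """Return the sequence name up to the first whitespace.

    Illustrates the naming the thread associates with -pF (an assumption
    about the rule, not spaln's actual implementation).
    """
    # Drop the leading '>' of a FASTA header, then keep the first token.
    tokens = header.lstrip(">").split(None, 1)
    return tokens[0] if tokens else ""


def last_bar_term(header: str) -> str:
    """Return the last term of the entry name when split on vertical bars.

    Illustrates the default behaviour reported in this thread.
    """
    return full_entry_name(header).split("|")[-1]


if __name__ == "__main__":
    header = ">KAF4646005.1|foo"
    print(last_bar_term(header))    # name seen in Target= in the report above
    print(full_entry_name(header))  # name requested in the issue
```

With the header from the report, last_bar_term gives foo and full_entry_name gives KAF4646005.1|foo, which matches the difference described above.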