evolbioinfo / goalign

Goalign is a set of command line tools and an API to manipulate multiple sequence alignments. It is implemented in Go.
GNU General Public License v2.0

nexus token not recognised #3

Closed (cokelaer closed this issue 5 years ago)

cokelaer commented 5 years ago

When converting the Nexus example below with the latest version of goalign, we get this error message:

Unknown token "" in taxlabel list

We are not sure whether this is a goalign issue or whether the Nexus file below is malformed. The file seems to be valid, though, since it can be converted to FASTA with other tools, for instance squizz or Biopython.

#NEXUS
[ Title fasta file]
begin taxa;
   dimensions ntax= 6;
   taxlabels
      read0
      read1
      read2
      read3
      read4
      read5
;
end;
begin characters;
   dimensions nchar= 100;
   format missing=? gap=- matchchar=. datatype=nucleotide interleave=yes;
   matrix

[!Domain=Data;]
read0 GCATGCTACCCCCGACTTCGAGGCTGGGTGGAGTTACCTTTAGGGGGGGTTGTGTGGGAGTGATAGGAGAGAGACCCAGACAGTATAGCATGTTGTTGGC
read1 AATCATCC.ATAT.C.G.TTCA.G..TTC.CGA.GTTACA..A.T.TCCGGTC.G.TACCGCTG.CCC..ACTTTATTT.TGACTC.AGT..G.CAACAG
read2 .G.A..CCA.G..ACGG..TT.ATAC.AATTTT.CTAA.GGCTATCCCTACA.AACCT.ACCGGGCAT.TA.TGTGTCAC.GT.G.TT.GACG.AAA.AG
read3 ATCCCGCT.GATG.G.C..ATT..GTCCACT....GATC..CT..A.TA..TA.GAAAGCAAG..AACTCCTTGTA..A.T.AAGATCTTA.A..GGCAT
read4 AGGGATG.A.GAT.CTCG.AGTTGAT.C.CAGAAGTG.CA.T.C..TA.AAACAAAT.TTCCCAGATT.TTGACTGATA.GTAGGACCTCA..C..GACT
read5 TG.GA.ATTTGAG.GAGC...TTAGATTAT..TGCCGTCAATCAT.A.CGCAC.GTTTTAACGCCTT.ATCTCC.GACTCTCAC.GCTA.CCCGTGA.CT
;
end;
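
A note for anyone reading the matrix above: the `format` line declares `matchchar=.`, so a `.` in rows read1 to read5 stands for the character found at the same column of the first row (read0). The sketch below shows that expansion in Go. It is only an illustration of the convention, not goalign's parser, and the helper name `expandMatchChar` is hypothetical.

```go
package main

import "fmt"

// expandMatchChar returns row with every occurrence of matchChar replaced by
// the character at the same column of ref (the first sequence of the matrix).
// Hypothetical helper for illustration only; it is not part of goalign.
func expandMatchChar(ref, row string, matchChar byte) (string, error) {
	if len(ref) != len(row) {
		return "", fmt.Errorf("length mismatch: ref has %d columns, row has %d", len(ref), len(row))
	}
	out := []byte(row)
	for i := range out {
		if out[i] == matchChar {
			out[i] = ref[i]
		}
	}
	return string(out), nil
}

func main() {
	// First 10 columns of read0 and read1 from the file above.
	ref := "GCATGCTACC"
	row := "AATCATCC.A"
	expanded, err := expandMatchChar(ref, row, '.')
	if err != nil {
		panic(err)
	}
	fmt.Println(expanded) // prints "AATCATCCCA"
}
```

The length check matters because a match character can only be resolved when the row and the reference cover the same columns.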

fredericlemoine commented 5 years ago

You can try this release: v0.3.1-alpha2. It should solve the issue.