51mystic closed 10 months ago
Added code execution error information:
Executing file "matrix-80-align-concatenate-nexus.nexus"
UNIX line termination
Longest line length = 5823723
Parsing file
Expecting NEXUS formatted file
Reading data block
Allocated taxon set
Allocated matrix
Defining new matrix with 7 taxa and 5823716 characters
Data is Dna
Missing data coded as ?
Gaps coded as -
Taxon 1 -> LD10A
Error while parsing a string. Token "tctttgatctacctggcaac...[followed by at least 99970 more charectors]" is too long. Maximum allowed length of a token is 99990
The error occurred when reading char. 662483-662502 on line 6 in the file 'matrix-80-align-concatenate-nexus.nexus'
Returning execution to command line ...
Error in command "Execute"
Will exit with signal 1 (error) because quitonerror is set to yes
If you want control to be returned to the command line on error,
use 'mb -i
Excuse me, teacher, how can I get the interleaved sequence?
Most probably this is an issue with the input format: each sequence is written on a single very long line (5.8 M characters), which exceeds the maximum token length of 99990 reported in the error. There are many ways to convert between sequence formats, using websites or programming languages. Below are two examples using the Python scripting language (tested with Python 3.10.12 and Biopython 1.79):
Convert a DNA alignment in FASTA to interleaved NEXUS:
from Bio import SeqIO
SeqIO.convert("infile.fas", "fasta", "outfile.nex", "nexus", "DNA")
Convert a DNA alignment in non-interleaved NEXUS to interleaved NEXUS:
from Bio import SeqIO
SeqIO.convert("infile.nex", "nexus", "outfile2.nex", "nexus", "DNA")
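If Biopython is not available, the same idea can be sketched in plain Python. This is only an illustration, not the method used above: the function names `write_interleaved_nexus` and `longest_token`, the `block` size, and the toy sequences are all assumptions made for the example, and reading the existing 5.8 M character matrix into the dictionary is not shown. The point is that every taxon's sequence is cut into short chunks, so no single token comes anywhere near the 99990 character limit reported in the error.

```python
import io


def write_interleaved_nexus(seqs, handle, block=1000):
    """Write aligned sequences as an interleaved NEXUS data block.

    seqs:  dict mapping taxon name -> aligned sequence (hypothetical input).
    block: characters written per taxon in each block (arbitrary choice).
    """
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    nchar = lengths.pop()
    width = max(len(name) for name in seqs)

    handle.write("#NEXUS\n")
    handle.write("begin data;\n")
    handle.write(f"  dimensions ntax={len(seqs)} nchar={nchar};\n")
    handle.write("  format datatype=dna missing=? gap=- interleave=yes;\n")
    handle.write("  matrix\n")
    # One block per slice of columns; every taxon appears once in each block.
    for start in range(0, nchar, block):
        for name, seq in seqs.items():
            handle.write(f"  {name.ljust(width)}  {seq[start:start + block]}\n")
        handle.write("\n")
    handle.write("  ;\nend;\n")


def longest_token(text):
    """Return the length of the longest whitespace-delimited token."""
    return max((len(tok) for tok in text.split()), default=0)


# Toy data: taxon names come from the thread, the sequences are made up.
toy = {"LD10A": "acgt" * 5, "LD7A": "ac-t" * 5}
buf = io.StringIO()
write_interleaved_nexus(toy, buf, block=8)
nexus_text = buf.getvalue()
print(nexus_text)
print("longest token:", longest_token(nexus_text))
```

Running `longest_token` on the text of a converted file is a quick way to confirm that nothing exceeds the limit before passing the file to MrBayes.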
Closing with comment above
Dear teacher, when I execute the code, the sequence is too long and the run is interrupted. Following the error guidance, I added the interleave information, but after saving, it still cannot run. I would like to ask how to solve this problem. Here is what my code looks like:
begin data;
dimensions ntax=7 nchar=5823716;
format datatype=dna missing=? gap=- interleave;
matrix
LD10A catgatgaaggaaattttggatattaacggggatttttttgg......
...
LD7A cgtattgaatacaacttttt---ttgttaacggggatttttttgg......
;
end;
...
begin mrbayes;
set autoclose=yes nowarn=yes;
lset nst=6 rates=invgamma;
prset statefreqpr=fixed(equal);
outgroup LD10A mcmc ngen=1000000 printfreq=1000 samplefreq=100 samplefile=/mnt/data/userdata/svip019/00----outcome/mrbayes-o/myout.nex;
sumt burnin=250;
end;
Looking forward to your answer.