Closed maressyl closed 4 years ago
Hello Sylvain,
You have found a really serious bug. Indeed, the matrix rows are in a different order than in the FASTA file (it's because we originally needed only the distribution of distances, so we didn't care about the ordering, and somehow we forgot about it when pushing the changes). I'll fix this urgently. Thanks for reporting, and sorry for the time you lost tracking down the bug.
Regards, Adam
Hi,
I added the bugfix in the experimental branch in the repository (1.5.12 release). Now, the matrix rows are named after the sequences and are in the same order as in the FASTA file. By the way, the command line for producing the matrix has been simplified in this release: when the -dist_export switch is used, the guide tree and alignment are not produced, and the output file is used for storing the matrix. In your case it would be:
famsa-1.5.12-linux-static -dist_export test.fa matrix.out
Please let me know if it works.
Regards, Adam
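One way to confirm the fix on your own data is to compare the row names in the exported matrix with the FASTA headers. The following is only a minimal sketch: it assumes the exported file is comma-separated, has no header row, and carries the sequence name in the first field of each row. That layout is an assumption, not something stated in this thread, so adjust the parsing to what your FAMSA version actually writes. The function names are hypothetical helpers.

```python
import csv


def first_token(text):
    """Return the first whitespace-delimited token of text, or '' if none."""
    parts = text.split()
    return parts[0] if parts else ""


def fasta_names(path):
    """Sequence names from the '>' header lines, in file order."""
    with open(path) as handle:
        return [first_token(line[1:]) for line in handle if line.startswith(">")]


def matrix_row_names(path):
    """Row names of the exported matrix.

    ASSUMPTION: comma-separated file, no header row, name in the first field.
    """
    with open(path, newline="") as handle:
        return [first_token(row[0]) for row in csv.reader(handle) if row]


def same_order(fasta_path, matrix_path):
    """True if the matrix rows are named and ordered like the FASTA records."""
    return fasta_names(fasta_path) == matrix_row_names(matrix_path)
```

With the command above, this would be called as `same_order("test.fa", "matrix.out")`.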
Dear Adam,
Thanks a lot for your answer and this quick fix. It seems the problem is solved, at least on the example data I provided. I will continue to play a bit with FAMSA and let you know if I encounter another problem.
Best regards, Sylvain
The bugfix has been incorporated into the master branch.
Hi,
I find FAMSA very promising, however I am a bit frustrated with the distance matrix returned with -dist_export in the latest release. How are sequences ordered in the output? Building a tree in R from this matrix, I can easily identify identical sequences, but it is clear that the rows are not sorted as the provided FASTA file was, since identical sequences are now next to each other. Is there a way to presort my FASTA file to match the matrix ordering, or to add labels to the distance matrix?
Best regards, Sylvain
PS: Here are the command and input file I use:
famsa-1.3.2-linux-static -gt upgma -dist_export out.dist -gt_export out.newick test.fa out.fa
test.zip
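The symptom described above (identical sequences revealing that rows do not follow the FASTA order) can also be checked without building a tree. The sketch below assumes the matrix has already been loaded as a full, symmetric, square list of lists, and that the distance between two sequences depends only on those two sequences. Under that assumption, two identical sequences must have the same distance to every third sequence, so any disagreement means the rows are not in the presumed order. The function name is a hypothetical helper, and reading the matrix file is left out because its layout is not described in this thread.

```python
def inconsistent_duplicates(sequences, matrix, tol=1e-9):
    """Find identical sequences whose matrix rows disagree.

    sequences: sequences in FASTA order.
    matrix:    ASSUMED full symmetric square matrix (list of lists), where
               matrix[i][j] is claimed to be the distance between
               sequences[i] and sequences[j].
    Returns a list of index pairs (i, j) with sequences[i] == sequences[j]
    but different distances to some third sequence.
    """
    first_seen = {}
    bad = []
    for j, seq in enumerate(sequences):
        i = first_seen.setdefault(seq, j)
        if i == j:
            continue  # first occurrence of this sequence
        for k in range(len(sequences)):
            if k in (i, j):
                continue
            if abs(matrix[i][k] - matrix[j][k]) > tol:
                bad.append((i, j))
                break
    return bad


# Toy data: sequences 0 and 2 are identical.
sequences = ["AAA", "CCC", "AAA"]
in_order = [[0, 5, 0], [5, 0, 5], [0, 5, 0]]
shuffled = [[0, 0, 5], [0, 0, 5], [5, 5, 0]]  # rows for AAA, AAA, CCC

print(inconsistent_duplicates(sequences, in_order))  # []
print(inconsistent_duplicates(sequences, shuffled))  # [(0, 2)]
```

An empty result does not prove the ordering is right (a file without duplicate sequences always passes), but a non-empty result is a strong hint that rows and FASTA records are misaligned.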