FePhyFoFum / phyx

phylogenetics tools for linux (and other mostly posix compliant) computers
blackrim.org
GNU General Public License v3.0

pxrlt and pxrls only change the final name on the list #170

Closed by VaneMorales 2 years ago

VaneMorales commented 2 years ago

Hi! I am new to using this tool. I have been trying to change the names in a tree and in a sequence file, but every time I try I get the same result: the only name that changes is the last one on the list (Senecio otites V03). Does anyone know why this happens and how I can solve it? Tree_new_names Aligment_new_names

josephwb commented 2 years ago

Hi @VaneMorales. Can you please provide example input files and commands you are using? I will take a look at it.

VaneMorales commented 2 years ago

Hi, many thanks for answering so quickly :) Here are the command line and the input files I use with pxrlt:

pxrlt -t ./ITS/ITS_result.raxml.bestTree -c ./ITS/ITS_numbers_s_otites.txt -n ./ITS/ITS_names_s_otites.txt -o ./ITS/ITS_bestTree_renamed

ITS_result.raxml.zip

josephwb commented 2 years ago

Ok, I will look at this today. Sorry for the problems.

josephwb commented 2 years ago

The problem is that the names in the file are not being matched to the names in the tree. You can use the verbose option (-v) to get more information:

$ pxrlt -c ITS_numbers_s_otites.txt -n ITS_names_s_otites.txt -t ITS_result.raxml.bestTree -v
The following names to match were not found in the tree:
EF538363.1
GU818665.1
GU818666.1
GU818667.1
GU818668.1
GU818669.1
GU818670.1
Senecio_otites_V1_ITS_consensus_sequence
Senecio_otites_V2_ITS_consensus_sequence
Senecio_otites_V6_ITS_consensus_sequence
Senecio_otites_V7_ITS_consensus_sequence
(((GU818669.1:0.005328,GU818670.1:0.003992):0.000001,(EF538363.1:0.005434,((Senecio_otites_V7_ITS_consensus_sequence:0.000001,(Senecio_otites_V2_ITS_consensus_sequence:0.002144,Senecio_otites_V03:0.000001):0.002113):0.000001,Senecio_otites_V1_ITS_consensus_sequence:0.000001):0.000001):0.000001):0.000001,(GU818666.1:0.001328,(Senecio_otites_V6_ITS_consensus_sequence:0.000001,GU818665.1:0.00133):0.00133):0.000001,(GU818668.1:0.005327,GU818667.1:0.002661):0.000001);

But those names are in the tree, so I will have to track down what is going wrong with the matching.

VaneMorales commented 2 years ago

Many thanks for looking at this. I thought it might be because the software adds a space after each name when the alignment is exported (*.phy), but that space does not appear in the tree or in the input files (the current and new names of the sequences). It is not about the length of the names either.

josephwb commented 2 years ago

Hey @VaneMorales. Sorry about the delay on this. I have a chunk of time later today when I'll tackle this... or die trying (^_-)≡☆

josephwb commented 2 years ago

Sorry for the delay.

It turns out that the problem is that the input files have Windows line endings (a Carriage Return followed by a Line Feed, CRLF) rather than the expected Unix Line Feeds (LF). The result is that all the names to match (except the last one!) each end with a stray carriage return character, which is why they were not matching.
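A quick way to check whether a file has this problem is to count the lines that contain a carriage return. This is only a rough sketch: the demo file and the variable names (`demo_names`, `crlf_lines`) are made up for illustration, and the demo assumes the last line has no line terminator at all.

```shell
# Build a small demo file with Windows (CRLF) line endings.
# The last line deliberately has no terminator (an assumption for this demo).
demo_names="$(mktemp)"
printf 'EF538363.1\r\nGU818665.1\r\nSenecio_otites_V03' > "$demo_names"

# Store a literal carriage return so it can be passed to grep portably.
cr=$(printf '\r')

# Count the lines that contain a carriage return character.
crlf_lines=$(grep -c "$cr" "$demo_names")
echo "lines with a carriage return: $crlf_lines"
```

Any count above zero suggests the file should be converted before being passed to pxrlt or pxrls.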

Ideally phyx would handle all types of line endings, and we hope to support that at some point, but for now the easiest thing to do is to convert your files before running the commands. If you are working on Linux you can use the program dos2unix. On Ubuntu you can install it with:

sudo apt-get install dos2unix

From within the directory that contains your files, you can convert all of them to Unix line endings with the command:

dos2unix *

If you want to keep the Windows-formatted files, either be careful where you run the command, or make backups of your files.
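If you would rather leave the originals untouched, one alternative is to write converted copies under a new name using the standard `tr` utility. This is a minimal sketch, not the only way to do it: the file names here are placeholders, and note that `tr -d '\r'` removes every carriage return in the file, not only those at line ends.

```shell
# Demo input with CRLF endings (stands in for a file exported on Windows).
workdir="$(mktemp -d)"
printf 'EF538363.1\r\nGU818665.1\r\n' > "$workdir/names.txt"

# Write a converted copy; the original file is left as it was.
tr -d '\r' < "$workdir/names.txt" > "$workdir/names_unix.txt"

# Confirm that no carriage returns remain in the copy.
remaining=$(tr -cd '\r' < "$workdir/names_unix.txt" | wc -c)
echo "carriage returns remaining: $((remaining))"
```

The converted copies can then be passed to the -c and -n options in place of the originals.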

Once converted, the files run correctly:

$ pxrlt -c ITS_numbers_s_otites_unix.txt -n ITS_names_s_otites_unix.txt -t ITS_result.raxml.bestTree
(((Senecio_otites_C05:0.005328,Senecio_otites_C06:0.003992):0.000001,(Senecio_otites_Plowman_2627:0.005434,((Senecio_otites_V07:0.000001,(Senecio_otites_V02:0.002144,Senecio_otites_V03:0.000001):0.002113):0.000001,Senecio_otites_V01:0.000001):0.000001):0.000001):0.000001,(Senecio_otites_C02:0.001328,(Senecio_otites_V06:0.000001,Senecio_otites_C01:0.00133):0.00133):0.000001,(Senecio_otites_C04:0.005327,Senecio_otites_C03:0.002661):0.000001);

Sorry for the headache!

VaneMorales commented 2 years ago

Thank god you are not dead!!! I was starting to get worried. Many thanks for looking into this and for pointing me to the solution (I have had several issues because I am running these analyses in Ubuntu, but I had managed to fix them all until now), so I am very grateful for your help :)

josephwb commented 2 years ago

(>ᴗ•)