smirarab / binning

Code for statistical binning and related scripts

Taxa names converted to all uppercase #2

Closed: uribe-convers closed this issue 6 years ago

uribe-convers commented 8 years ago

Dear Dr. Mirarab,

I have successfully run this code and think it's a great approach to species tree estimation. I did, however, notice that once the super-genes are created, the names of the taxa in the new FASTA files are converted to all uppercase. Is there a reason for this?

Usually, people don't write species names in all caps but rather use the more common binomial capitalization of Genus species (e.g., Drosophila melanogaster). Once the names are changed in the super-gene files, and thus in the super-gene trees, they no longer match the capitalization used in the other genes, gene trees, and bootstrap trees that didn't need to be binned, so downstream programs (e.g., ASTRAL II) won't recognize the taxa as the same.

This is a small request, but would it be possible to add an option to preserve the capitalization of the original FASTA files?

Thanks a lot and thank you for creating this great program!

Best, Simon

smirarab commented 8 years ago

Can you try changing 1 to 0 in this line and see if it works?

https://github.com/smirarab/binning/blob/master/perl/concatenate_alignments.pl#L175

If it does, let me know and I will update the repository.

Thanks, Siavash

uribe-convers commented 8 years ago

Yes, it works great!

Thank you, Simon
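
For anyone who already has super-gene files with uppercased names and would rather not rerun the pipeline, here is a minimal sketch of a post-hoc workaround in Python. It is not part of the binning code: the function name `restore_case` and its arguments are hypothetical, and it assumes that each FASTA header line holds only the taxon name and that no two original names differ only in capitalization.

```python
def restore_case(fasta_lines, original_names):
    """Yield FASTA lines with header names mapped back to their original case.

    Hypothetical helper, not part of the binning scripts.
    fasta_lines: iterable of lines from an uppercased super-gene FASTA file.
    original_names: taxon names as they appear in the original alignments.
    """
    lookup = {}
    for name in original_names:
        key = name.upper()
        # Refuse to guess if two original names collide once uppercased.
        if key in lookup and lookup[key] != name:
            raise ValueError(
                "names differ only by case: %r and %r" % (lookup[key], name)
            )
        lookup[key] = name

    for line in fasta_lines:
        if line.startswith(">"):
            upper_name = line[1:].strip()
            # Headers with no known original name are passed through unchanged.
            yield ">" + lookup.get(upper_name.upper(), upper_name) + "\n"
        else:
            yield line if line.endswith("\n") else line + "\n"


# Example with made-up data.
example = [">DROSOPHILA_MELANOGASTER\n", "ACGT\n", ">UNKNOWN_TAXON\n", "TTGA\n"]
fixed = list(restore_case(example, ["Drosophila_melanogaster"]))
```

The same lookup idea would apply to taxon labels in the super-gene trees, but that needs a Newick-aware tool and is not shown here. Changing the script as suggested above remains the cleaner fix.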