sate-dev / sate-core


intermediate trees are outputted with internal taxon names #14

Closed smirarab closed 11 years ago

smirarab commented 12 years ago

It seems that intermediate trees are written out with the internal taxon names. Is this intended behavior or a bug?

I do see that we now also output the taxon name mapping, which is a very good idea. But I think it would make more sense to write the intermediate trees with the normal taxon names as well.

mtholder commented 12 years ago

Hi, that was intentional. The motivation was to make the files usable with a wider variety of software (many programs place fairly severe restrictions on taxon names).

The primary motivation was making it easier to do things like run bootstrapping in RAxML on the concatenated files (to do that effectively, we should also output the partition file for each run; I intend to do that soon).

I'll concede that it doesn't make as much sense when using FastTree (because the FASTA-formatted files will tolerate the normal, non-"safe" names).

Let's keep this as an open, but low-priority, issue to be dealt with later.

thanks, Mark


smirarab commented 12 years ago

Sure, since the mapping is written out, the user can always map the taxon names back.

This makes me think: maybe it'd be good to provide a name mapping service in SATe. It would read the name map that was written out and generate a new alignment/tree. Alternatively, we could bundle a simple script with SATe that reads the mapping file and an alignment/tree and performs the name mapping.
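To make the bundled-script idea concrete, here is a rough sketch of what such a helper might look like. It is not SATe code. The mapping format (one `safe_name<TAB>original_name` pair per line) is an assumption for illustration and may not match the file SATe actually writes, and the function names `read_name_map`, `rename_newick`, and `rename_fasta` are hypothetical. It also assumes the safe names are plain unquoted tokens and that the tree has at least one pair of parentheses.

```python
import re


def read_name_map(text):
    """Parse 'safe<TAB>original' lines into a dict.

    ASSUMPTION: this tab-separated layout is hypothetical; adapt the
    parser to whatever layout the real mapping file uses.
    """
    name_map = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        safe, original = line.split("\t", 1)
        name_map[safe.strip()] = original.strip()
    return name_map


# Labels made only of these characters are left unquoted in Newick.
_UNQUOTED_OK = re.compile(r"^[A-Za-z0-9_.\-]+$")

# A leaf label is an unquoted token that directly follows "(" or ",".
_LEAF = re.compile(r"([(,]\s*)([^(),:;'\s]+)")


def newick_label(name):
    """Quote a name for Newick if it contains special characters."""
    if _UNQUOTED_OK.match(name):
        return name
    return "'" + name.replace("'", "''") + "'"


def rename_newick(newick, name_map):
    """Replace safe leaf names in a Newick string with original names."""
    def swap(match):
        prefix, label = match.group(1), match.group(2)
        return prefix + newick_label(name_map.get(label, label))
    return _LEAF.sub(swap, newick)


def rename_fasta(fasta, name_map):
    """Replace safe names on FASTA header lines with original names."""
    out = []
    for line in fasta.splitlines():
        if line.startswith(">"):
            label = line[1:].strip()
            line = ">" + name_map.get(label, label)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    mapping = read_name_map("t1\tHomo sapiens\nt2\tPan_troglodytes\n")
    print(rename_newick("((t1:0.1,t2:0.2):0.05,t3:0.3);", mapping))
    print(rename_fasta(">t1\nACGT\n>t2\nAC-T\n", mapping), end="")
```

Names missing from the map are passed through unchanged, so a partially mapped file does not fail. Internal node labels and branch lengths are not touched, because only tokens that directly follow "(" or "," are treated as leaf names.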

mtholder commented 12 years ago

I agree that a generic name mapping service would be a good feature. This would actually be useful in a large number of contexts in biodiversity informatics. Perhaps we can discuss this at the upcoming NESCent hackathon.

joaks1 commented 11 years ago

I am closing this issue, because the use of "safe names" is an intended feature.