jeetsukumaran / DendroPy

A Python library for phylogenetic scripting, simulation, data processing and manipulation.
https://pypi.org/project/DendroPy/
BSD 3-Clause "New" or "Revised" License

Make PHYLIP writing work correctly with missing taxa #33

Closed pranjalv123 closed 9 years ago

pranjalv123 commented 9 years ago

The current version of DendroPy has an issue where the following happens:

Suppose m is a DnaCharacterMatrix with some sequences:

m.write('phylip')

    5 4
    t1 actg
    t2 ctga
    t3 cccc
    t4 ctag
    t5 ggtg

m.discard_taxa([t1, t3])

m.write('phylip')

    3 4
    t1
    t2 ctga
    t3
    t4 ctag
    t5 ggtg

m.write('phylip')

    5 4
    t1
    t2 ctga
    t3
    t4 ctag
    t5 ggtg

The taxon count in the header changes from 3 to 5 between the last two writes because the PhylipWriter adds blank entries to the character matrix for the discarded taxa.
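
The kind of side effect described above can be reproduced with a toy model. This is only a sketch, not DendroPy code: the `taxa` list stands in for the taxon namespace, a `defaultdict` stands in for the character matrix, and `naive_phylip` is a hypothetical writer. It assumes the blank entries arise from a lookup that creates missing keys, which may not be how the real PhylipWriter does it.

```python
from collections import defaultdict

# Toy stand-ins, not DendroPy classes: `taxa` plays the role of the taxon
# namespace and the defaultdict plays the role of the character matrix
# after t1 and t3 have been discarded.
taxa = ["t1", "t2", "t3", "t4", "t5"]
matrix = defaultdict(str, {"t2": "ctga", "t4": "ctag", "t5": "ggtg"})

def naive_phylip(matrix, taxa, nchar=4):
    """Hypothetical writer that looks up every taxon in the matrix."""
    # The header is computed before the rows are visited.
    lines = ["%d %d" % (len(matrix), nchar)]
    for taxon in taxa:
        # Side effect: indexing a defaultdict inserts a blank entry
        # for any taxon that has no sequence.
        lines.append(("%s %s" % (taxon, matrix[taxon])).rstrip())
    return "\n".join(lines)

first = naive_phylip(matrix, taxa)   # header says 3, but 5 rows are written
second = naive_phylip(matrix, taxa)  # header now says 5
```

In this model the first call reports 3 taxa and the second reports 5, matching the pattern shown in the outputs above.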

This patch fixes this issue. It also adds an option to PhylipWriter to suppress printing taxa that have no sequences, which is needed when interfacing with programs like FastTree.
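
As a rough illustration of the behaviour described here, the sketch below writes PHYLIP-style output without modifying the matrix and can optionally skip taxa that have no sequence. It is not the patch itself: `write_phylip` and its `suppress_missing_taxa` parameter are hypothetical names, the matrix is modelled as a plain dict, and having the header count the rows actually written is a choice made for this sketch.

```python
def write_phylip(matrix, taxa, nchar, suppress_missing_taxa=False):
    """Hypothetical side-effect-free writer; `matrix` maps taxon -> sequence."""
    rows = []
    for taxon in taxa:
        seq = matrix.get(taxon)  # .get() never inserts a key into the dict
        if seq is None:
            if suppress_missing_taxa:
                continue         # leave out taxa that have no sequence
            seq = ""             # otherwise write the label with no data
        rows.append(("%s %s" % (taxon, seq)).rstrip())
    # In this sketch the header counts the rows that are written.
    header = "%d %d" % (len(rows), nchar)
    return "\n".join([header] + rows)

taxa = ["t1", "t2", "t3", "t4", "t5"]
matrix = {"t2": "ctga", "t4": "ctag", "t5": "ggtg"}  # t1 and t3 discarded

out = write_phylip(matrix, taxa, 4, suppress_missing_taxa=True)
out_again = write_phylip(matrix, taxa, 4, suppress_missing_taxa=True)
```

Because the lookup uses `.get()`, repeated calls leave the matrix untouched and produce identical output, which is the property a program like FastTree would need from the suppressed form.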

jeetsukumaran commented 9 years ago

Hi Pranjal,

Thanks for this. It is a useful addition. The only thing is that it needs a test. I will be happy to work one up, but may not get around to it until this weekend.