nexml / nexml.java

Java API for NeXML.
MIT License

Mesquite plugin does not write state labels #1

Closed (balhoff closed this issue 12 years ago)

balhoff commented 12 years ago

In a standard categorical matrix, the states for each character are typically different, even though they may use common symbols such as 0 and 1. For example, in this NEXUS CHARSTATELABELS block, characters 12 and 13 do not share labels for their states:

12 lacrimal_orbital_processes / only_ventral_present dorsal_and_ventral_present,
13 Quadratojugal / present absent,

When Mesquite saves this matrix as NeXML, it creates a single 'states' element that is shared by all the characters. It should instead create a separate 'states' element for each character. Further, the state labels should be written into the 'state' elements.

NEXUS file: https://gist.github.com/3208772
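To make the requested output concrete, here is a minimal sketch of the structure described above: one 'states' element per character, with each 'state' carrying its label. It deliberately avoids the nexml.java and Mesquite APIs. The class name `PerCharacterStatesSketch`, the method `renderFormat`, and the generated ids (`states1`, `c1`, ...) are hypothetical, and the sketch assumes that labels are listed in symbol order starting at 0.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the plugin's code and
// does not use the nexml.java API.
class PerCharacterStatesSketch {

    /**
     * Renders a 'format' fragment with one 'states' element per character.
     * Assumption: the i-th label of a character belongs to symbol i (0, 1, ...).
     * Labels are written as-is; real code would need to XML-escape them.
     */
    static String renderFormat(Map<String, List<String>> stateLabelsByCharacter) {
        StringBuilder xml = new StringBuilder("<format>\n");
        StringBuilder chars = new StringBuilder();
        int charIndex = 0;
        for (Map.Entry<String, List<String>> entry : stateLabelsByCharacter.entrySet()) {
            charIndex++;
            String statesId = "states" + charIndex;
            xml.append("  <states id=\"").append(statesId).append("\">\n");
            List<String> labels = entry.getValue();
            for (int symbol = 0; symbol < labels.size(); symbol++) {
                xml.append("    <state id=\"").append(statesId).append("_s").append(symbol)
                   .append("\" symbol=\"").append(symbol)
                   .append("\" label=\"").append(labels.get(symbol)).append("\"/>\n");
            }
            xml.append("  </states>\n");
            // Each character points at its own 'states' element.
            chars.append("  <char id=\"c").append(charIndex)
                 .append("\" states=\"").append(statesId)
                 .append("\" label=\"").append(entry.getKey()).append("\"/>\n");
        }
        return xml.append(chars).append("</format>\n").toString();
    }

    public static void main(String[] args) {
        // The two characters from the CHARSTATELABELS example.
        Map<String, List<String>> characters = new LinkedHashMap<>();
        characters.put("lacrimal_orbital_processes",
                Arrays.asList("only_ventral_present", "dorsal_and_ventral_present"));
        characters.put("Quadratojugal", Arrays.asList("present", "absent"));
        System.out.print(renderFormat(characters));
    }
}
```

With this layout, symbol 0 of the first character carries the label only_ventral_present while symbol 0 of the second carries present, which is the distinction a single shared 'states' element cannot express.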

rvosa commented 12 years ago

One additional note on this issue: care needs to be taken to distinguish between small categorical matrices, where it makes sense to have a separate 'states' element for each column, and large nucleotide alignments, where the same 'states' element should be re-used for all columns.
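One possible way to handle both cases, sketched here as an assumption rather than as what the plugin does, is to key 'states' elements on their content: columns whose state lists are identical share one element, and every other column gets its own. The class `StateSetReuseSketch` and the method `assignStateSetIds` are hypothetical names, not part of nexml.java or Mesquite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from the plugin or from nexml.java.
class StateSetReuseSketch {

    /**
     * Returns one state-set id per column. Columns with equal state lists
     * (same entries in the same order) receive the same id.
     */
    static List<String> assignStateSetIds(List<List<String>> statesPerColumn) {
        // List.equals and List.hashCode compare contents, so a list can act as the key.
        Map<List<String>, String> idsByStates = new LinkedHashMap<>();
        List<String> assigned = new ArrayList<>();
        for (List<String> states : statesPerColumn) {
            String id = idsByStates.get(states);
            if (id == null) {
                id = "states" + (idsByStates.size() + 1);
                idsByStates.put(states, id);
            }
            assigned.add(id);
        }
        return assigned;
    }

    public static void main(String[] args) {
        List<String> dna = Arrays.asList("A", "C", "G", "T");
        // Nucleotide alignment: every column re-uses the same state set.
        System.out.println(assignStateSetIds(Arrays.asList(dna, dna, dna)));
        // prints [states1, states1, states1]

        // Categorical matrix: differing labels lead to separate state sets.
        System.out.println(assignStateSetIds(Arrays.asList(
                Arrays.asList("only_ventral_present", "dorsal_and_ventral_present"),
                Arrays.asList("present", "absent"))));
        // prints [states1, states2]
    }
}
```

A consequence of this design is that two categorical characters with exactly the same labels (for example, two present/absent characters) would also share a 'states' element. Whether that is acceptable is a separate decision from the one discussed here.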

balhoff commented 12 years ago

Fixed by 04358b6e40ab8b1930ed36fa62f299e566b796dc.