alife-data-standards / alife-data-standards

Repository to host data standards for the ALIFE community.
https://alife-data-standards.github.io/alife-data-standards/
MIT License

Row order in phylogeny data standard? #18

Closed by mmore500 1 year ago

mmore500 commented 1 year ago

Realized today that I had been implicitly assuming that the ALife standard phylogeny data tables I work with are topologically sorted, running from top (most ancient) to bottom (most recent). In other words, that parent entries always appear before their offspring.
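To make the assumption concrete, here is a minimal sketch of a check for it. The row layout is an assumption made for illustration: each row is a dict with a hypothetical `id` key and a hypothetical `ancestors` key holding a list of parent ids (empty for a root). It is not the standard's on-disk format.

```python
def is_topologically_sorted(rows):
    """Return True if every row's ancestors appear earlier in the table.

    Assumed (hypothetical) layout: each row is a dict with an "id" and an
    "ancestors" list of parent ids; roots have an empty list.
    """
    seen = set()
    for row in rows:
        # An ancestor that has not been seen yet is either listed later in
        # the table or missing from it entirely; both count as a failure here.
        if any(ancestor not in seen for ancestor in row["ancestors"]):
            return False
        seen.add(row["id"])
    return True


sorted_rows = [
    {"id": 0, "ancestors": []},
    {"id": 1, "ancestors": [0]},
    {"id": 2, "ancestors": [1]},
]
print(is_topologically_sorted(sorted_rows))        # True
print(is_topologically_sorted(sorted_rows[::-1]))  # False
```

The check is a single pass over the table, so it is cheap to run as a precondition before any code that relies on the ordering.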

On a closer read of the website, I don't think this convention is directly stated anywhere. However, all the ALife standard phylogeny data sets I've encountered so far are arranged in a topologically sorted manner.

Being able to assume that phylogeny data sets are sorted in this manner would simplify some operations, like identifying tree roots (which would always be at the top of the file).
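As a rough illustration of the simplification, compare root lookup with and without the ordering guarantee. The row layout is again an assumption for illustration (dicts with hypothetical `id` and `ancestors` keys), and both helper names are made up for this sketch.

```python
def find_roots(rows):
    """Full scan: works regardless of row order."""
    return [row["id"] for row in rows if not row["ancestors"]]


def first_root_if_sorted(rows):
    """Shortcut that is only valid when the table is topologically sorted.

    The first row has no rows before it, so in a sorted table it cannot
    have any ancestors and must be a root.
    """
    return rows[0]["id"] if rows else None


rows = [
    {"id": 5, "ancestors": []},
    {"id": 6, "ancestors": [5]},
    {"id": 7, "ancestors": []},  # a second, independent root
]
print(find_roots(rows))
print(first_root_if_sorted(rows))
```

For a table with a single root, the shortcut replaces the scan entirely. With several independent roots, sortedness only guarantees that the first row is one of them, so the full scan is still needed to collect them all.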

Should this be added as a standard requirement? Would this invalidate any existing tools or datasets? Can we assume all relevant phylogenies will be topologically sortable?

mmore500 commented 1 year ago

Took a look through some more example data sets from different sources, and many are not topologically ordered. Requiring this ordering would be a nontrivial, breaking change, so it would be better suited to an additional requirement than to the base standard.
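For tools that want the ordering anyway, one option is to sort a table on load. Below is a minimal sketch using Kahn's algorithm, under the same illustrative assumptions as before: rows are dicts with hypothetical `id` and `ancestors` keys, not the standard's actual file format. It also speaks to the sortability question, since a table can be topologically sorted exactly when its ancestor relationships contain no cycles.

```python
from collections import deque


def topological_sort(rows):
    """Return rows reordered so that ancestors precede descendants.

    Assumed (hypothetical) layout: each row is a dict with an "id" and an
    "ancestors" list. Ancestor ids that are absent from the table (for
    example, pruned lineages) are ignored in this sketch.
    """
    by_id = {row["id"]: row for row in rows}
    children = {row_id: [] for row_id in by_id}
    pending = {}  # number of in-table ancestors not yet emitted
    for row in rows:
        parents = [a for a in row["ancestors"] if a in by_id]
        pending[row["id"]] = len(parents)
        for parent in parents:
            children[parent].append(row["id"])

    # Start from rows with no in-table ancestors (the roots).
    ready = deque(row_id for row_id, count in pending.items() if count == 0)
    ordered = []
    while ready:
        current = ready.popleft()
        ordered.append(by_id[current])
        for child in children[current]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(ordered) != len(rows):
        raise ValueError("table has a cycle or duplicate ids; cannot sort")
    return ordered


unsorted_rows = [
    {"id": 2, "ancestors": [1]},
    {"id": 1, "ancestors": [0]},
    {"id": 0, "ancestors": []},
]
print([row["id"] for row in topological_sort(unsorted_rows)])  # [0, 1, 2]
```

Sorting on load keeps the base standard unchanged while still letting downstream code rely on parents appearing before offspring.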