phyloref / phylo2owl

Tool to convert phylogenies to OWL ontologies
MIT License
4 stars 2 forks source link

Crowl et al. experiment #23

Closed gaurav closed 7 years ago

gaurav commented 7 years ago

This pull request expands our test suite by integrating three phylogenetic trees from Crowl et al., 2014 -- the Plastid ML tree, the PPR ML tree and the Plastid plus PPR ML tree. Many of the phyloreferences from the Plastid plus PPR ML tree applied to the Plastid ML tree without many changes, but only one of those phyloreferences -- after radical changes -- applied to the PPR ML tree. Together, I think this is a pretty good starting set to figure out how to test phyloreferences across different phylogenies, and how to write phyloreferences so that they'll work on different phylogenies. In testing these phyloreferences and getting them to work, I also improved how the test suite reports failed reasoner tests.

Since it consists of three commits that are pretty well organized, I think we should rebase these changes in instead of squashing this pull request.

gaurav commented 7 years ago

@hlapp Thanks for the comments!

There are clearly more than 3 commits in this PR. Is the 3 in error, or are you only referring to a subset of them?

Argh, yes, I meant "six" (although there are seven now, because I realized that I'd forgotten to get rid of the "Node that ..."). I probably got it confused with the number of phylogenies being added (three). Sorry for the confusion!

Having only few commits in a PR doesn't mean that can't still be the case, so the consideration whether or not it's the case shouldn't be triggered by exceeding N commits.

Agreed! My commits tend to be many and haphazard, and then right before I make the pull request, I reorganize them by purpose using a rebase. That's what I was trying to convey when I said that "three [sic] commits that are pretty well organized" -- I guess may be I should have just said "These comments are well-organized and I think convey useful information when considered separately, and so should be rebased instead of squashed"?

Why are you modifying the phyloreferences because of the nature of the phylogeny?

Because my original phyloreferences refer to leafs that don't exist in some of the other phylogenies! So I guess I need to go back and rewrite those phyloreferences with leaves that exist in all our phylogenies, or we need to provide the reasoner with information on where those leaves went (e.g. that "Wahlenbergia angustifolia" may be contained within "Wahlenbergia sp"). In one case, the phyloreference works, but is hard to test because it includes so much of the phylogeny. I agree that I should have documented this better, so I went ahead and did that.

I'd like to incorporate these changes now, then go work on phyloderef before rewriting these phyloreferences, since phyloderef is a much better place to build and test phyloreferences than the test suite. But I could also do it the other way around -- let me know which you'd prefer, or we can discuss this on Friday!