phyloref / phylo2owl

Tool to convert phylogenies to OWL ontologies
MIT License

Validating the OWL representation #10

Open gaurav opened 7 years ago

gaurav commented 7 years ago

test_owl_output.py currently loads the RDF/XML files produced by phylo2owl to make sure they are valid XML, but it doesn't check whether their contents make sense. Given how we generate the RDF/XML, it's pretty unlikely that anything nonsensical will be introduced into the file, such as a node that claims to be its own sibling or a part of the phylogeny that is completely disjoint from the rest. Such errors will probably also be caught by the reasoner: if it notices errors in an OWL ontology generated by phylo2owl.py, we can add a check for that precise kind of logical error. I think that's a better approach than spending time trying to guess which OWL errors are likely.
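For reference, the two kinds of nonsense mentioned above (a node that is its own sibling, and a disjoint part of the phylogeny) could be checked roughly as follows. This is only a sketch: it works on plain (subject, predicate, object) tuples rather than on a parsed RDF graph, and the predicate names `has_Child` and `has_Sibling` are placeholders, not necessarily the vocabulary phylo2owl.py emits.

```python
from collections import defaultdict, deque

# Placeholder predicate names (an assumption for this sketch);
# the vocabulary phylo2owl.py actually emits may differ.
HAS_CHILD = "has_Child"
HAS_SIBLING = "has_Sibling"


def self_siblings(triples):
    """Return the nodes that claim to be their own sibling."""
    return sorted({s for s, p, o in triples if p == HAS_SIBLING and s == o})


def connected_components(triples):
    """Group nodes into components, treating every triple as an undirected edge."""
    neighbours = defaultdict(set)
    for s, _p, o in triples:
        neighbours[s].add(o)
        neighbours[o].add(s)

    seen = set()
    components = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Breadth-first search from each node not yet visited.
        seen.add(start)
        component = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            component.add(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        components.append(component)
    return components


# Made-up example data containing both kinds of error.
example_triples = [
    ("root", HAS_CHILD, "n1"),
    ("root", HAS_CHILD, "n2"),
    ("n1", HAS_SIBLING, "n2"),
    ("n2", HAS_SIBLING, "n2"),  # a node that is its own sibling
    ("x1", HAS_CHILD, "x2"),    # a fragment disjoint from the rest
]
```

A test could then assert that `self_siblings(...)` is empty and that `connected_components(...)` returns exactly one component for each phylogeny in the file.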

If you agree, I'll rename test_owl_output.py to test_rdf_output.py and call phylo2owl.py done for now. However, if you have any ideas on how we can test the RDF output, please suggest them here or we can discuss them on our next phone call!
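To make the scope of the renamed test concrete, here is a minimal sketch of a syntax-level check using only the standard library. It is not the code in test_owl_output.py; the function name `looks_like_rdf_xml` and the sample document are made up for illustration, and it assumes the output uses an `rdf:RDF` root element, as RDF/XML documents typically do.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def looks_like_rdf_xml(text):
    """Return True if text is well-formed XML whose root element is rdf:RDF."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Not well-formed XML (this includes unbound namespace prefixes).
        return False
    # ElementTree expands qualified names to "{namespace}localname".
    return root.tag == "{%s}RDF" % RDF_NS


# Made-up sample document, not real phylo2owl.py output.
sample = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/tree"/>
</rdf:RDF>"""
```

A check like this only confirms the syntax, which is why test_rdf_output.py describes it better than test_owl_output.py: it says nothing about whether the ontology is logically consistent.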

hlapp commented 7 years ago

Here are some particularly relevant references on validating RDF graphs against stated constraints and expectations that I meant to send you, in case you aren't aware of them already:

  1. Prud’hommeaux, Eric, Jose Emilio Labra Gayo, and Harold Solbrig. 2014. “Shape Expressions: An RDF Validation and Transformation Language.” In Proceedings of the 10th International Conference on Semantic Systems, 32–40. ACM. http://dx.doi.org/10.1145/2660517.2660523
  2. Hansen, Jacob Baungard, Andrew Beveridge, Roisin Farmer, Leif Gehrmann, Alasdair J. G. Gray, Sunil Khutan, Tomas Robertson, and Johnny Val. 2015. “Validata: An Online Tool for Testing RDF Data Conformance.” In Proceedings of the 8th International Conference on Semantic Web Applications and Tools for Life Sciences (SWAT4LS), edited by James Malone, Robert Stevens, Kerstin Forsberg, and Andrea Splendiani, 157–66.
  3. Bolleman, Jerven, Sebastien Gehant, and the UniProt Consortium. 2012. “Catching Inconsistencies with the Semantic Web: A Biocuration Case Study.” In 5th International Workshop on Semantic Web Applications and Tools for Life Sciences (SWAT4LS 2012), edited by Adrian Paschke, Albert Burger, Paolo Romano, M. Scott Marshall, and Andrea Splendiani. Vol. 952. CEUR Workshop Proceedings.