PHI-base / phipo

Pathogen-Host Interaction Phenotype Ontology

preparing a "test" obo file for developers #13

Closed ValWood closed 6 years ago

ValWood commented 6 years ago

@kimrutherford @barnabynorman @martin2urban @jseager7

When @CuzickA returns it would be useful to prepare a test obo file for the developers.

This only needs a small set of terms, but we plan to have 2 curation options in the tool: "pathogen only phenotype" and "pathogen host interaction phenotype".

(image: phenotype ontology)

These will be in 2 different branches of the "PHI-pathogen-phenotype ontology"

This seems to mirror GO, which is effectively 3 separate ontologies (molecular function, biological process and cellular component) in the same ontology. See http://snapshot.geneontology.org/ontology/go-basic.obo, where they are separated like so: "namespace: biological_process"

So, I wonder if we need to introduce a "namespace" into the obo file to allow for the 2 types of phenotype?

(Alayne, this file literally only needs one or two terms for each aspect, and does not need to have logical definitions yet, as these are only required for reasoning.)
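As a rough illustration of the namespace idea above, here is a minimal Python sketch of what a two-term test file could look like and how a tool might group terms by their "namespace:" tag. The term IDs, names and namespace values are placeholders made up for this example, not the ones agreed for PHIPO, and the parsing is deliberately simplified (it is not a full OBO parser).

```python
from collections import defaultdict

# Hypothetical two-term test file; IDs, names and namespaces are placeholders.
TEST_OBO = """\
format-version: 1.2
ontology: phipo

[Term]
id: PHIPO:0000001
name: pathogen phenotype
namespace: pathogen_phenotype

[Term]
id: PHIPO:0000002
name: pathogen host interaction phenotype
namespace: pathogen_host_interaction_phenotype
"""


def terms_by_namespace(obo_text):
    """Group term IDs by their 'namespace:' tag (simplified OBO parsing)."""
    groups = defaultdict(list)
    for stanza in obo_text.split("\n\n"):
        lines = stanza.strip().splitlines()
        if not lines or lines[0] != "[Term]":
            continue  # skip the header and any non-term stanzas
        # Each remaining line is "tag: value"; repeated tags are not handled.
        tags = dict(line.split(": ", 1) for line in lines[1:] if ": " in line)
        groups[tags.get("namespace", "(none)")].append(tags["id"])
    return dict(groups)


print(terms_by_namespace(TEST_OBO))
```

A curation tool could use a grouping like this to offer one term list per curation option.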

CuzickA commented 6 years ago

I am still working on this. To add namespaces in Protege I need the obo plugin, which has been causing some difficulties. We planned on having 3 namespaces moving forward; in the past I think we had PHI_pathogen_phenotype, PHI_disease_formation_phenotype and PHI_host_phenotype. I will try and get these added to the current phipo.

For testing purposes, Alistair's basic 10 term obo file is available in the Dropbox folder, but it isn't separated into the above 3 namespaces: https://www.dropbox.com/preview/PHI-base/Most%20recent%20PHI-base%20phenotype%20ontologies%20from%20Alistair's%20files/phenotype_outcome.obo?role=personal

Perhaps it is better to wait until I can make these changes in phipo and push to GitHub, so that the phipo GitHub link can be used for PHI-Canto testing?

CuzickA commented 6 years ago

The obo plugin appears to be working better now, and is also installed on my desktop computer.

(screenshot: capture 20_04_2018)

I have added the 3 namespaces; still to populate with a few more terms.

New question for Protege preferences - do we want 'label' or auto-generated 'term id'? To look into and discuss.

CuzickA commented 6 years ago

I have amended the namespaces to pathogen_phenotype, disease_formation_phenotype and host_phenotype.

I have started adding some pathogen_phenotype terms; please see the screenshot below. These terms are now being separated into 'normal' and 'abnormal' phenotypes. Please give me input into the best structure of term names, e.g. duplication of words such as 'normal' and 'phenotype', @ValWood.

(screenshot: capture 27_04_2018)

This phipo.owl file is currently available here https://github.com/PHI-base/PHI-base_ontologies/blob/master/phipo.owl

Outstanding issues:

1) Protege is unable to save my current phipo.owl as an .obo file due to an error message (screenshot: capture 27_04_2018_obo save error). This may be due to an error introduced by creating the document on a laptop that previously had problems with the Protege/obo adaptor, or to an issue with loading external ontologies (CHEBI is causing difficulties).

2) I need to change the preferences from 'label' to 'auto generated term id', e.g. PHIPO:0000001.

Current idea is to re-enter data into a new Protege document and save as both .owl and .obo step by step as building the ontology (using the correct ID preference option). There may still be a problem with the external PURLs to resolve (@mah11 made an edit to my previous file to prevent PURLs being release-specific).

I welcome any feedback.

kimrutherford commented 6 years ago

(Alayne) I welcome any feedback.

James and I had a chat about this. Unfortunately we didn't come up with a solution, but I think there are a few things we can try. I suggested we use OWLTools on the command line to attempt to convert your current OWL file to OBO format to see if that works. I'll try that on Monday with the OWL file you have on GitHub.

ValWood commented 6 years ago

(Alayne) Please give me input into the best structure of term names e.g. duplication of words such as 'normal' and 'phenotype'

@mah11 is probably best to advise. I would omit "phenotype" unless it is necessary (mainly for the higher level grouping terms). So when you are describing real observations like "normal post-penetration" you can omit "phenotype"

I would also exclude underscores from term names (@mah11 I see some things, like "subsets", have underscores in FYPO - any wisdom about when and when not to use them?)

Anyway, it looks like you have made good progress considering all of the hurdles getting used to the file formats, Git, ontology speak and Protege all at once! Once we can see the obo files we can give more feedback.

ValWood commented 6 years ago

Hi @pgaudet can I borrow you for a minute?

Have you come across the above error when exporting obo files from Protege? If so, do you know what causes it? Or do you not create obo files from Protege in this way?

v

pgaudet commented 6 years ago

Hi @ValWood

I am not sure what you tried to do - it looks like some namespaces were changed? We can try to troubleshoot on Monday?

ValWood commented 6 years ago

OK that is exactly what happened! Thanks for the clue. If we can't trouble-shoot locally we'll let you know. Have a good weekend!

mah11 commented 6 years ago

(Alayne) Please give me input into the best structure of term names e.g. duplication of words such as 'normal' and 'phenotype'

(Val) I would omit "phenotype" unless it is necessary (mainly for the higher level grouping terms). So when you are describing real observations like "normal post-penetration" you can omit "phenotype"

I agree, it's certainly not necessary for every term name to include "phenotype". Just over 1% of FYPO terms have "phenotype" in their names.

Every normal phenotype does have to have "normal" in its name; same goes for abnormal. The words corresponding to PATO qualities (normal, abnormal, increased, etc.) are among the features that distinguish phenotypes from each other so they're essential.

That said, there may be individual cases that look like they're in a grey area; we're happy to try to advise on specific terms if you want help with any.

(Val) I would also exclude underscores from term names

Yes, I emphatically agree.

(Val) I see some things, like "subsets", have underscores in FYPO - any wisdom about when and when not to use them?

In FYPO I have followed GO's convention of using underscores in subset names, partly for consistency and partly because they work as IDs in some contexts. I don't know if anything would go horribly wrong if subset names had spaces instead.

Aside; skip this if it looks boring: The names of the three GO "root" terms have underscores for historical reasons, and they haven't been removed because it would break too many pipelines, scripts, etc. FYPO happens to have a one-word name for its root, so the issue doesn't arise. For PHIPO I suggest removing underscores from all the term names, even the roots, and only add any back if something breaks.

tl;dr:

mah11 commented 6 years ago

One more point, not directly related to @CuzickA's questions:

I strongly recommend changing "chemistry" to "chemical" globally. The phenotypes aren't about sensitivity/resistance to the science that studies matter ;)
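The two mechanical naming points raised in this thread (no underscores in term names, and "chemical" rather than "chemistry") could be checked automatically. The following is a minimal Python sketch of such a check; the function name and the example term names are hypothetical and only there for illustration. It does not try to check the "normal"/"abnormal" rule, since that needs knowledge of what each term means.

```python
def name_problems(name):
    """Return a list of naming-convention problems for one term name."""
    problems = []
    if "_" in name:
        problems.append("contains underscore")
    if "chemistry" in name.lower():
        problems.append('uses "chemistry" instead of "chemical"')
    return problems


# Hypothetical term names, for illustration only.
for name in ["normal_growth", "resistance to chemistry", "abnormal growth"]:
    print(name, "->", name_problems(name) or "ok")
```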

kimrutherford commented 6 years ago

I've reported the exception: https://github.com/ontodev/robot/issues/289

jseager7 commented 6 years ago

Thanks for that, @kimrutherford.

Just to summarise, now that one of the developers of ROBOT has responded: the exception is related to "a conflict between the obo_id annotation in OWL and the IRI of the OWL entity being converted". It's a bug in the OBO2OWL conversion, which affects Protege and the OWL API, so it's not a fault of our own. It seems like ROBOT can be used to help us track down any term that causes the exception, in case this bug happens to show up again.
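To make the kind of conflict described above more concrete, here is a small Python sketch that compares an ID annotation with the ID implied by a term's IRI. This is not how ROBOT, Protege or the OWL API detect the problem; it is only an illustration. It assumes that term IRIs follow the usual OBO PURL pattern and that the ID annotation has already been extracted into a plain dictionary; the sample data is invented.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def obo_id_from_iri(iri):
    """Derive an OBO-style ID (PREFIX:NUMBER) from an OBO PURL, else None."""
    if not iri.startswith(OBO_PURL_BASE):
        return None
    prefix, sep, local = iri[len(OBO_PURL_BASE):].rpartition("_")
    return f"{prefix}:{local}" if sep and prefix else None


def conflicting_terms(id_annotations):
    """Yield (iri, annotated_id, derived_id) where the two IDs disagree."""
    for iri, annotated_id in id_annotations.items():
        derived = obo_id_from_iri(iri)
        if derived != annotated_id:
            yield iri, annotated_id, derived


# Invented sample: the second term carries an ID that does not match its IRI.
SAMPLE = {
    OBO_PURL_BASE + "PHIPO_0000001": "PHIPO:0000001",
    OBO_PURL_BASE + "PHIPO_0000002": "pathogen_phenotype:0000002",
}

for iri, annotated, derived in conflicting_terms(SAMPLE):
    print(f"{iri}: annotated as {annotated}, IRI implies {derived}")
```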

CuzickA commented 6 years ago

Thanks for all the input. I have created a new Protege file with different preferences, e.g. auto-generated term IDs, following the 'GO ontology editors guide'. I have saved and pushed phipo.owl and phipo.obo to GitHub. The file is still very basic and I haven't filled out all the term annotations yet. Here is a current snapshot:

(screenshot: phipo 08052018 capture)

Next step: to decide how to handle external ontology terms for the logical definitions

jseager7 commented 6 years ago

With regards to handling external ontology terms, I've been trying to set up a copy of the Ontology Starter Kit, which should (in theory) allow us to restrict the external ontology imports to a subset of terms that are used for logical definitions.

Admittedly the OSK is supposed to be used with a new ontology, but according to the project readme, it should be possible to adapt our existing ontology to follow the OSK structure.
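Independently of the OSK, one way to see which external terms would need importing is to list the IDs from other ontologies that appear in logical definitions. The Python sketch below does this for "intersection_of:" lines in an OBO file. The stanza is a made-up example, not a real PHIPO definition, and the `inheres_in` relation in it is only an assumption about how such a definition might be written.

```python
import re

# Matches OBO-style IDs such as PATO:0000001 or GO:0008150.
ID_PATTERN = re.compile(r"\b([A-Za-z][A-Za-z0-9_]*):(\d+)\b")


def external_ids(obo_text, own_prefix="PHIPO"):
    """Collect IDs from other ontologies used on intersection_of lines."""
    found = set()
    for line in obo_text.splitlines():
        if not line.startswith("intersection_of:"):
            continue
        # Keep the tag value only, dropping any trailing "! comment".
        value = line.split(":", 1)[1].split("!", 1)[0]
        for prefix, number in ID_PATTERN.findall(value):
            if prefix != own_prefix:
                found.add(f"{prefix}:{number}")
    return sorted(found)


# Made-up stanza for illustration; not taken from phipo.obo.
EXAMPLE = """\
[Term]
id: PHIPO:0000003
name: hypothetical example term
intersection_of: PATO:0000001 ! quality
intersection_of: inheres_in GO:0008150 ! biological_process
"""

print(external_ids(EXAMPLE))
```

The resulting list could then serve as the set of terms to keep when extracting a subset of each external ontology.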

jseager7 commented 6 years ago

Closing this, since it was fixed a long time ago. Canto is able to use phipo.obo from this repository now.