drdozer / oboformat

Automatically exported from code.google.com/p/oboformat

Conversion of IDs that are not in canonical OBO Foundry format #37

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Add support for conversion of non-canonical IDs, as outlined in section 5.9.2 of
http://berkeleybop.org/~cjm/obo2owl/obo-syntax.html#5.9

Note: canonical IDs are defined in

http://berkeleybop.org/~cjm/obo2owl/obo-syntax.html#2.5

Background:

If this tool is to be used for general OBO to OWL translation and 
roundtripping, then it needs to be able to cope with IDs that are not in OBO 
Foundry format. It currently fails to do this for IDs containing an underscore. 
Roundtripping such an ID truncates it at the first underscore, so IDs that are 
identical up to that point get merged.
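
To illustrate the failure mode described above, here is a minimal sketch. It does not reproduce the converter's actual code: `truncate_at_first_underscore` is a hypothetical stand-in for the reported behaviour, and `part_disjoint_from` is a made-up ID used only to show a collision.

```python
def truncate_at_first_underscore(obo_id: str) -> str:
    """Hypothetical stand-in for the reported bug: keep only the text before the first '_'."""
    return obo_id.split("_", 1)[0]


def colliding_ids(ids):
    """Group IDs by their truncated form and return only the groups with more than one member."""
    groups = {}
    for obo_id in ids:
        groups.setdefault(truncate_at_first_underscore(obo_id), []).append(obo_id)
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    # "part_of" and the made-up "part_disjoint_from" share the text before the first underscore.
    print(colliding_ids(["part_of", "part_disjoint_from", "develops_from"]))
```

Any two IDs that share the text before their first underscore end up in the same group, which is the merge described in the report.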

Original issue reported on code.google.com by dosu...@gmail.com on 20 Jun 2011 at 3:02

GoogleCodeExporter commented 9 years ago
Note, fixing this should also fix: 
http://code.google.com/p/oboformat/issues/detail?id=35

Original comment by dosu...@gmail.com on 20 Jun 2011 at 3:04

GoogleCodeExporter commented 9 years ago
Almost fixed.  Here's my latest test:

--------
ontology: test

[Typedef]
id: part_of
name: part_of

[Typedef]
id: OBO_REL:part_of
name: part of

-----
> OBO2OWL >
-----
    <owl:ObjectProperty rdf:about="http://purl.obolibrary.org/obo/test#part_of">
    <owl:ObjectProperty rdf:about="http://purl.obolibrary.org/obo/OBO_REL#_part_of">
-----
> OWL2OBO
-----
ontology: test

[Typedef]
id: OBO_REL:part_of
name: part of

[Typedef]
id: http://purl.obolibrary.org/obo/test#part_of
name: part_of
----

The substitution s/\#\_/\:/ is correct.

But for a # on its own, just remove the # and everything preceding it when 
going from OWL to OBO.
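
The two rules above could be sketched roughly as follows. This is only an illustration in Python, not the project's Java code: the function name `owl_iri_to_obo_id` is hypothetical, and stripping the `http://purl.obolibrary.org/obo/` base in the `#_` case is an assumption inferred from the expected output `OBO_REL:part_of`.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"


def owl_iri_to_obo_id(iri: str) -> str:
    """Map an OWL IRI back to an OBO ID using the two rules from the comment above."""
    # Rule 1: "#_" marks a prefixed, non-canonical ID, so PREFIX#_LOCAL becomes PREFIX:LOCAL.
    # Assumption: the OBO base is dropped first, if present.
    if "#_" in iri:
        rest = iri[len(OBO_BASE):] if iri.startswith(OBO_BASE) else iri
        return rest.replace("#_", ":", 1)
    # Rule 2: a bare "#" marks an unprefixed ID, so drop the "#" and everything before it.
    if "#" in iri:
        return iri.split("#", 1)[1]
    # Anything else is left untouched in this sketch.
    return iri


if __name__ == "__main__":
    for iri in (OBO_BASE + "OBO_REL#_part_of", OBO_BASE + "test#part_of"):
        print(iri, "->", owl_iri_to_obo_id(iri))
```

The `#_` check has to come first, because an IRI containing `#_` also contains a bare `#`.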

Original comment by dosu...@gmail.com on 8 Jul 2011 at 3:02

GoogleCodeExporter commented 9 years ago
The # is now removed in the OWL2OBO translation.

Original comment by shahid.m...@gmail.com on 8 Jul 2011 at 6:37

GoogleCodeExporter commented 9 years ago

Original comment by shahid.m...@gmail.com on 8 Jul 2011 at 6:37

GoogleCodeExporter commented 9 years ago
Same test with -r147 =>

----
ontology: test

[Typedef]
id: OBO_REL:part_of
name: part of

[Typedef]
id: test:part_of
name: part_of
----

i.e. the ID part_of is still not roundtripping.  Instead it becomes 
test:part_of.

Original comment by dosu...@gmail.com on 11 Jul 2011 at 9:50

GoogleCodeExporter commented 9 years ago
My last changes were not committed to the repository. Can you check it now?

Original comment by shahid.m...@gmail.com on 11 Jul 2011 at 3:05

GoogleCodeExporter commented 9 years ago
Results are still as in Comment 5.  Can you just use the attached file as your 
unit test?  After a roundtrip, the output should be identical to the starting file.
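
As a rough idea of what such a roundtrip check looks like, here is a self-contained Python sketch. It is not the project's JUnit test. Both function names are hypothetical, only the two non-canonical ID shapes from this thread are handled (canonical IDs are deliberately left out), and the IRI patterns are copied from the OBO2OWL output shown in the earlier comment.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"


def obo_id_to_iri(obo_id: str, ontology: str) -> str:
    """OBO -> OWL for the two non-canonical ID shapes seen in this thread."""
    if ":" in obo_id:
        # Prefixed, non-canonical ID: PREFIX:LOCAL -> <base>PREFIX#_LOCAL
        prefix, local = obo_id.split(":", 1)
        return f"{OBO_BASE}{prefix}#_{local}"
    # Unprefixed ID: ID -> <base>ONTOLOGY#ID
    return f"{OBO_BASE}{ontology}#{obo_id}"


def iri_to_obo_id(iri: str) -> str:
    """OWL -> OBO: turn '#_' into ':', otherwise drop '#' and everything before it."""
    if "#_" in iri:
        rest = iri[len(OBO_BASE):] if iri.startswith(OBO_BASE) else iri
        return rest.replace("#_", ":", 1)
    return iri.split("#", 1)[1] if "#" in iri else iri


def roundtrip(ids, ontology):
    """Translate every ID to an IRI and back again."""
    return [iri_to_obo_id(obo_id_to_iri(i, ontology)) for i in ids]


if __name__ == "__main__":
    ids = ["part_of", "OBO_REL:part_of"]
    assert roundtrip(ids, "test") == ids, "IDs changed during roundtrip"
    print("roundtrip OK")
```

The check passes only if every ID comes back unchanged, which is the criterion stated above for the attached file.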

Original comment by dosu...@gmail.com on 11 Jul 2011 at 3:28

GoogleCodeExporter commented 9 years ago
Not tickled about the default prefix: where does 
http://purl.obolibrary.org/obo/test/ come from? "test" might be a reasonable 
namespace. Moreover, if two people use ontology: test, there could be a URL 
clash. I think we should either insist on a prefix definition or append a GUID 
to the ontology name.

Original comment by alanruttenberg@gmail.com on 11 Jul 2011 at 8:18

GoogleCodeExporter commented 9 years ago
Irrelevant. It simply comes from the ontology name specified in the first line 
of the test file.  I could just as well have called it fubar.

Original comment by dosu...@gmail.com on 11 Jul 2011 at 8:53

GoogleCodeExporter commented 9 years ago
Relevant. The namespaces below purl.obolibrary.org/obo/ are reserved for 
ontologies that have requested a namespace.

If you want something like this (which you shouldn't - there's no real 
difficulty in forcing there to be a fully qualified URI for the name) then 
reserve something like 

http://purl.obolibrary.org/obo/unregistered/ as the prefix, or make the prefix 
be a urn: or file: 

i.e. don't crap up semweb space please

Original comment by alanruttenberg@gmail.com on 13 Jul 2011 at 7:17

GoogleCodeExporter commented 9 years ago
The problem is that obolib-obo2owl takes the line

ontology: blah
and rolls an ontology with the URI
http://purl.obolibrary.org/obo/blah.owl
and may use http://purl.obolibrary.org/obo/blah as part of a term ID URI under 
some circumstances.

We need to be able to make toy test obo files for testing purposes. AFAIK there 
is currently no way to do this without generating the URIs you are objecting to.

So, if you want these tests to generate a URI that doesn't potentially mess 
with the Foundry's reserved bit of semweb space, then either we need to reserve 
'unregistered' for test purposes or we need a mechanism built into the obolib 
code.  Please can you either specify a reserved URI space under 
http://purl.obolibrary.org/obo/ for testing purposes, or specify some system and 
file a feature request for specifying test ontology URIs.  Once I have an 
option, I will happily avoid generating URIs you don't like.  Until then, I'm 
afraid I don't see that I have an option if I want to continue the productive 
cycle of testing and bug fixes we've achieved in this project.

Original comment by dosu...@gmail.com on 13 Jul 2011 at 10:38

GoogleCodeExporter commented 9 years ago
I gave two suggestions. But to be specific: until there is a mechanism to 
specify the base URI (I thought there was one already, but will check), use 
http://purl.org/NET/obo/<name>.owl

I will review and submit a feature request later, if necessary.

If you want to be a bit safer, use a generated GUID between "/" and 
"<name>".

Original comment by alanruttenberg@gmail.com on 13 Jul 2011 at 12:06

GoogleCodeExporter commented 9 years ago
"use http://purl.org/NET/obo/<name>.owl"

Need to be able to specify a base URI to do this.  

Tried

ontology: http://purl.org/NET/obo/f882e2ac-886f-4478-b276-621eb7ca47e2

But the OWL is screwy in places:

     xmlns:f882e2ac-886f-4478-b276-621eb7ca47e2="http://purl.obolibrary.org/obo/http://purl.org/NET/obo/f882e2ac-886f-4478-b276-621eb7ca47e2#"

Using

ontology: f882e2ac-886f-4478-b276-621eb7ca47e2

(-> http://purl.obolibrary.org/obo/f882e2ac-886f-4478-b276-621eb7ca47e2 )

Works fine though.  Surely this is safe enough.

Original comment by dosu...@gmail.com on 13 Jul 2011 at 12:30

GoogleCodeExporter commented 9 years ago
It is safe enough but misleading. Please do one of the following:
1) Make it 
http://purl.obolibrary.org/obo/unregistered/f882e2ac-886f-4478-b276-621eb7ca47e2
 but then include an annotation on the ontology explaining that the URI is not 
intended to be resolved.
2) Use tag URIs: http://www.taguri.org/ These can easily be made unique and 
don't come with the implication that the URI is resolvable.
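
For illustration, the two options might produce identifiers like the ones built below. This is a hedged Python sketch, not part of the converter: the helper names are hypothetical, `example.org` and the date are placeholder values, and the `tag:` layout follows the general `tag:authority,date:specific` form of the tag URI scheme.

```python
import uuid
from datetime import date

UNREGISTERED_BASE = "http://purl.obolibrary.org/obo/unregistered/"


def unregistered_iri(name=None):
    """Option 1: an IRI under the proposed 'unregistered/' space; a random UUID is used if no name is given."""
    return UNREGISTERED_BASE + (name or str(uuid.uuid4()))


def tag_uri(authority: str, day: date, specific: str) -> str:
    """Option 2: a tag URI of the form tag:<authority>,<YYYY-MM-DD>:<specific>."""
    return f"tag:{authority},{day.isoformat()}:{specific}"


if __name__ == "__main__":
    name = "f882e2ac-886f-4478-b276-621eb7ca47e2"
    print(unregistered_iri(name))
    # "example.org" and the date are placeholders, not values from this thread.
    print(tag_uri("example.org", date(2011, 7, 15), name))
```

With option 1 the ontology would still need the annotation mentioned above, since the IRI looks resolvable; the tag URI carries no such implication.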

Original comment by alanruttenberg@gmail.com on 15 Jul 2011 at 8:55

GoogleCodeExporter commented 9 years ago
Can we keep irrelevant discussions off the tracker? Thanks.

This now roundtrips, so this issue is fixed (see also the RoundTrip junit tests)
---
ontology: test

[Typedef]
id: part_of
name: part_of

[Typedef]
id: OBO_REL:part_of
name: part of
--

Original comment by cmung...@gmail.com on 15 Jul 2011 at 10:07

GoogleCodeExporter commented 9 years ago
"Can we keep irrelevant discussions off the tracker? "

??

Which discussion did you consider irrelevant? 

Original comment by alanruttenberg@gmail.com on 18 Jul 2011 at 11:48

GoogleCodeExporter commented 9 years ago
I think Chris means - not relevant to the subject of the ticket.  A discussion 
of what IDs to use for test ontologies is of general relevance, just not (or 
not directly) here.  

BTW - this seems to work:

ontology: unregistered/f882e2ac-886f-4478-b276-621eb7ca47e2

>
http://purl.obolibrary.org/obo/unregistered/f882e2ac-886f-4478-b276-621eb7ca47e2.owl

although after roundtrip
=>
ontology: f882e2ac-886f-4478-b276-621eb7ca47e2

Original comment by dosu...@gmail.com on 19 Jul 2011 at 10:08

GoogleCodeExporter commented 9 years ago
The issue is not what to do with test ontologies but how to construct URIs in 
the OBO to OWL conversion so that all valid OBO cases are covered, and none 
lead to violations of semweb specifications.

Original comment by alanruttenberg@gmail.com on 20 Jul 2011 at 12:35

GoogleCodeExporter commented 9 years ago
- Comment 18  by project member alanruttenberg, Today (8 hours ago)

> The issue is not what to do with test ontologies but how to construct URIs in 
> the OBO to OWL conversion so that all valid OBO cases are covered, and none 
> lead to violations of semweb specifications.

There's no accounting for what people might choose to use as a value for the 
OBO header tag 'ontology'.  

The problem is that the converter takes any value in this field and uses it to

(a) roll a URI for the whole ontology using the base URI 
'http://purl.obolibrary.org/obo/'

(b) roll a URI for  any non-canonical OBO IDs using the base URI 
'http://purl.obolibrary.org/obo/'.

It looks to me like the solution is:
(a) Have the conversion code read a list of registered URIs with the base 
'http://purl.obolibrary.org/obo/' and use some other URI scheme for unregistered 
URIs.  (Having the converter only work for registered ones would be too strict.)

(b) Always use some other URI scheme for non-canonical OBO IDs.
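
A minimal Python sketch of this proposal follows. Everything named here is an assumption for illustration: the contents of `REGISTERED_ONTOLOGIES`, the choice of the `unregistered/` space as the fallback (borrowed from the suggestion earlier in this thread), and the function names. None of it is the converter's actual API.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"
# Assumption: fallback space taken from the 'unregistered/' suggestion in this thread.
FALLBACK_BASE = "http://purl.obolibrary.org/obo/unregistered/"
# Hypothetical registry; a real converter would read this list from somewhere.
REGISTERED_ONTOLOGIES = frozenset({"go", "cl"})


def ontology_iri(name: str, registered=REGISTERED_ONTOLOGIES) -> str:
    """(a) Use the OBO base only for registered ontology names, the fallback otherwise."""
    base = OBO_BASE if name in registered else FALLBACK_BASE
    return f"{base}{name}.owl"


def non_canonical_id_iri(obo_id: str, ontology: str) -> str:
    """(b) Non-canonical IDs always go under the fallback base, registered or not."""
    if ":" in obo_id:
        prefix, local = obo_id.split(":", 1)
        return f"{FALLBACK_BASE}{prefix}#_{local}"
    return f"{FALLBACK_BASE}{ontology}#{obo_id}"


if __name__ == "__main__":
    print(ontology_iri("go"))
    print(ontology_iri("test"))
    print(non_canonical_id_iri("part_of", "test"))
    print(non_canonical_id_iri("OBO_REL:part_of", "test"))
```

Keeping the registry as a parameter means the converter still works for unregistered ontologies, which addresses the concern that a registered-only converter would be too strict.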

Original comment by dosu...@gmail.com on 20 Jul 2011 at 9:32