OBOFoundry / COB

An experimental ontology containing key terms from Open Biological and Biomedical Ontologies (OBO)
https://obofoundry.github.io/COB
Creative Commons Zero v1.0 Universal

Query for taxa used in OBO #60

Open jamesaoverton opened 4 years ago

jamesaoverton commented 4 years ago

It would be good to know what taxa are used in OBO, and have a script to build a little tree of them.

I used the Ontobee SPARQL endpoint http://sparql.hegroup.org/sparql/ to run variations on this query:

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?graph_uri ?taxon ?label
WHERE {
  GRAPH ?graph_uri {
    ?s owl:onProperty <http://purl.obolibrary.org/obo/RO_0002162> ; # in taxon
       owl:someValuesFrom ?taxon . # some X
    ?taxon rdfs:label ?label .
  }
}
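
Since the "variations on this query" mostly differ in which property is restricted, a small helper can generate them. This is only a sketch: it assumes the variation just swaps the property IRI (RO_0002160 is 'only in taxon' in the Relation Ontology), and the names `build_taxon_query`, `IN_TAXON`, and `ONLY_IN_TAXON` are made up for illustration. It builds the query text and does not contact the endpoint.

```python
# Property IRIs from the Relation Ontology (RO).
IN_TAXON = "http://purl.obolibrary.org/obo/RO_0002162"       # 'in taxon'
ONLY_IN_TAXON = "http://purl.obolibrary.org/obo/RO_0002160"  # 'only in taxon'

# Doubled braces are literal braces after str.format().
QUERY_TEMPLATE = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?graph_uri ?taxon ?label
WHERE {{
  GRAPH ?graph_uri {{
    ?s owl:onProperty <{property_iri}> ;
       owl:someValuesFrom ?taxon .
    ?taxon rdfs:label ?label .
  }}
}}
"""


def build_taxon_query(property_iri: str) -> str:
    """Return the SPARQL text for "'<property>' some X" restrictions."""
    return QUERY_TEMPLATE.format(property_iri=property_iri)
```

Calling `build_taxon_query(ONLY_IN_TAXON)` gives the 'only in taxon' variant, which can then be pasted into the endpoint's query form or sent with whatever SPARQL client is at hand.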

There aren't many results for "'in taxon' some X"; there are more for "'only in taxon' some X". The vast majority of these come from PR, and most of those are for viruses and bacteria. Counting PR there are about 700 distinct taxa; excluding PR there are about 50. That list of 50 is interesting.

Rough analysis here:

https://docs.google.com/spreadsheets/d/16D7l0G-DL1Liv7yYFYVBEgRNpuNepCQCZoQTLYXXorA
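
For the "little tree" part, here is a minimal sketch of one way it could go. It assumes the parent of each taxon is already known (for example from `rdfs:subClassOf` links in NCBITaxon, which the query above does not fetch) and that the parent links are acyclic. The function names and the sample data are hypothetical, and the sample lineage is heavily simplified.

```python
from collections import defaultdict


def build_tree(parent_of):
    """Turn {child: parent-or-None} into (sorted roots, {parent: sorted children}).

    A node whose parent is None, or whose parent is not itself a key,
    is treated as a root.
    """
    children = defaultdict(list)
    roots = []
    for child, parent in parent_of.items():
        if parent is None or parent not in parent_of:
            roots.append(child)
        else:
            children[parent].append(child)
    return sorted(roots), {p: sorted(c) for p, c in children.items()}


def render_tree(nodes, children, indent=0):
    """Return the tree as a list of lines, two spaces of indent per level."""
    lines = []
    for node in nodes:
        lines.append("  " * indent + node)
        lines.extend(render_tree(children.get(node, []), children, indent + 1))
    return lines


# Hypothetical, simplified input for illustration only.
sample = {
    "Eukaryota": None,
    "Metazoa": "Eukaryota",
    "Homo sapiens": "Metazoa",
    "Viruses": None,
}
roots, children = build_tree(sample)
tree_lines = render_tree(roots, children)
```

`print("\n".join(tree_lines))` then shows the indented tree.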

balhoff commented 4 years ago

Why not include only in taxon usages in the list? There are taxa used in GO that aren't in your current list.

bpeters42 commented 4 years ago

Jim: They are on a different tab

jamesaoverton commented 4 years ago

I did check for 'only in taxon' using a variation on that query, with results in the Google Sheet, but you're right that I somehow missed GO. I'll make sure I get GO when I run this again.

This was quick-and-dirty, but I was asked to follow up later. So I made this issue mostly to remind myself.

bpeters42 commented 4 years ago

James: The 'unique' tab still has the issue that you are reporting taxa multiple times if they are used with different labels. For example: [image: image.png]
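
One possible way to avoid the duplicates is to key on the taxon IRI rather than on the (IRI, label) pair, and keep the labels as a set per taxon. A rough sketch, assuming the query results are available as (taxon IRI, label) pairs; the function name and the sample rows are hypothetical:

```python
from collections import defaultdict


def group_labels_by_taxon(rows):
    """Collapse (taxon_iri, label) pairs into {taxon_iri: sorted unique labels}."""
    labels = defaultdict(set)
    for taxon_iri, label in rows:
        labels[taxon_iri].add(label)
    return {iri: sorted(found) for iri, found in labels.items()}


# Hypothetical rows for illustration only: one taxon appears with two labels.
sample_rows = [
    ("http://purl.obolibrary.org/obo/NCBITaxon_9606", "Homo sapiens"),
    ("http://purl.obolibrary.org/obo/NCBITaxon_9606", "human"),
    ("http://purl.obolibrary.org/obo/NCBITaxon_10090", "Mus musculus"),
]
unique_taxa = group_labels_by_taxon(sample_rows)
```

With this, `len(unique_taxa)` counts each taxon once, and the label variants are still available for review.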

jamesaoverton commented 4 years ago

You're right @bpeters42, I haven't updated the sheet since we discussed it. I'm just putting the task here, where I'll remember it.

balhoff commented 4 years ago

Jim: They are on a different tab

Thank you @bpeters42, I totally missed that.

jamesaoverton commented 4 years ago

I have a prototype here: