obophenotype / ncbitaxon

Build for NCBITaxon
BSD 3-Clause "New" or "Revised" License

Rank "forma specialis" is left as URL identifier in ncbitaxon.obo #60

Closed. vmedea closed this issue 1 year ago

vmedea commented 2 years ago

I don't know if it is intentional, but I ran into this irregularity while parsing. Whereas all other ranks are represented as local terms (NCBITaxon:superorder, etc.), there is one that is represented as a URL throughout the obo file:

format-version: 1.2
data-version: 2021-12-14
⋮ 
[Term]
id: NCBITaxon:100902
name: Fusarium oxysporum f. sp. conglutinans
namespace: ncbi_taxonomy
alt_id: NCBITaxon:178564
synonym: "Fusarium oxysporum f. conglutinans" RELATED synonym []
xref: GC_ID:1
is_a: NCBITaxon:5507 ! Fusarium oxysporum
property_value: has_rank http://purl.obolibrary.org/obo/NCBITaxon_forma_specialis
⋮ 
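For anyone parsing the file in the meantime, here is a minimal sketch of a workaround that accepts both spellings of the rank. The helper names `normalize_rank` and `ranks_in_stanza` are made up for this example, and it assumes the `has_rank` value is always the third whitespace-separated token on a `property_value` line, as in the stanza above.

```python
import re

OBO_PURL = "http://purl.obolibrary.org/obo/"


def normalize_rank(value):
    """Return the bare rank name for a CURIE-style or URL-style has_rank value."""
    if value.startswith(OBO_PURL):
        # e.g. "NCBITaxon_forma_specialis"
        value = value[len(OBO_PURL):]
    # Drop a leading "NCBITaxon:" or "NCBITaxon_", whichever is present.
    return re.sub(r"^NCBITaxon[:_]", "", value)


def ranks_in_stanza(stanza):
    """Collect normalized has_rank values from the lines of one OBO stanza."""
    ranks = []
    for line in stanza.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "property_value:" and parts[1] == "has_rank":
            ranks.append(normalize_rank(parts[2]))
    return ranks


stanza = """\
[Term]
id: NCBITaxon:100902
name: Fusarium oxysporum f. sp. conglutinans
property_value: has_rank http://purl.obolibrary.org/obo/NCBITaxon_forma_specialis
"""

print(ranks_in_stanza(stanza))                 # ['forma_specialis']
print(normalize_rank("NCBITaxon:superorder"))  # superorder
```

This only papers over the inconsistency on the consumer side; it does not address why the file contains two forms.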
cmungall commented 2 years ago

I suspect this is an owl2obo peculiarity due to odd URL structures (specifically, possible ambiguity from multiple underscores).

In fact, at the OWL level they are all URLs:

$ curl -L -s http://purl.obolibrary.org/obo/NCBITaxon_2 | grep superking
        <ns3:has_rank rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_superkingdom"/>
    <!-- http://purl.obolibrary.org/obo/NCBITaxon_superkingdom -->
    <Class rdf:about="http://purl.obolibrary.org/obo/NCBITaxon_superkingdom">
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">superkingdom</rdfs:label>
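The same check can be sketched offline in Python. This is only an illustration: the fragment is a trimmed, hand-written stand-in for the real RDF/XML, the namespace bound to `ns3` is a placeholder rather than the one in the actual file, and `rank_iris` is a name invented for this example. It matches on the local name `has_rank` so it does not depend on that namespace.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed fragment; the ns3 namespace below is a placeholder, not the real one.
FRAGMENT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ns3="http://example.org/placeholder#">
  <rdf:Description rdf:about="http://purl.obolibrary.org/obo/NCBITaxon_2">
    <ns3:has_rank rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_superkingdom"/>
  </rdf:Description>
</rdf:RDF>
"""


def rank_iris(rdf_xml):
    """Return the rdf:resource IRI of every has_rank element in the document."""
    root = ET.fromstring(rdf_xml)
    found = []
    for elem in root.iter():
        # ElementTree tags look like "{namespace}localname"; compare the local name only.
        if elem.tag.rsplit("}", 1)[-1] == "has_rank":
            found.append(elem.get(f"{{{RDF_NS}}}resource"))
    return found


print(rank_iris(FRAGMENT))
# ['http://purl.obolibrary.org/obo/NCBITaxon_superkingdom']
```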

While this should be fixed independently, I don't think it's a good idea to inject ranks into the NCBITaxon namespace. These URLs resolve neither in OBO nor in NCBI itself.

@balhoff is https://obofoundry.org/ontology/taxrank still active?

Should we use it?

In order not to break existing code, we could add a second assertion to taxrank:

http://purl.obolibrary.org/obo/TAXRANK_1000000

Or maybe just use Wikidata URIs?

jamesaoverton commented 2 years ago

I haven't had a chance to look into this, but I thought it was just a new rank that we hadn't seen before and that we should add it to this list: https://github.com/obophenotype/ncbitaxon/blob/master/src/ncbitaxon.py#L45. All the other ranks have similar IRIs, and we define them like this: https://github.com/obophenotype/ncbitaxon/blob/master/src/ncbitaxon.py#L314.
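If the cause is indeed a rank missing from a hard-coded list, a small guard could surface new ranks as soon as they appear upstream. The sketch below is not the code in `ncbitaxon.py`; `KNOWN_RANKS` is an abbreviated, hypothetical list and `unknown_ranks` is an invented helper, shown only to illustrate the idea.

```python
# Hypothetical, abbreviated list of ranks the build already knows about.
KNOWN_RANKS = {"superkingdom", "superorder", "species"}


def unknown_ranks(ranks_seen, known=KNOWN_RANKS):
    """Return the ranks found in the input data that are not in the known list."""
    return sorted(set(ranks_seen) - set(known))


# Ranks as they might appear in the source taxonomy dump (made-up sample).
seen = ["superkingdom", "species", "forma specialis", "species"]

print(unknown_ranks(seen))  # ['forma specialis']
```

A build could log or fail on a non-empty result, so that a new rank is noticed before it reaches a release.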

balhoff commented 2 years ago

Taxrank is not really active, but we could add to it and I should be able to make a release. I think it would be fine to use it in the NCBI taxonomy product.