ncbo / bioportal-project

Serves to consolidate (in Zenhub) all public issues in BioPortal
BSD 2-Clause "Simplified" License

EuroSciVoc: REST API fails to calculate root classes for this SKOS-format ontology #246

Closed: jvendetti closed this issue 2 years ago

jvendetti commented 2 years ago

We received a report from Xeni (OntoPortal Alliance member) that EcoPortal shows a "Problem retrieving classes" error when trying to view the classes for their EUROSCIVOC SKOS ontology. You can see the behavior on the EcoPortal site here:

http://ecoportal.lifewatch.eu/ontologies/EUROSCIVOC/?p=classes&conceptid=root

(EcoPortal is an instance of the OntoPortal virtual appliance)

Xeni sent me an RDF file so that we could try to reproduce this on our end. I created an entry on our staging server with the ontology here:

https://stage.bioontology.org/ontologies/EUROSCIVOCAPEU/?p=summary

... and we see the same "Problem retrieving classes" error on the classes page:

https://stage.bioontology.org/ontologies/EUROSCIVOCAPEU/?p=classes&conceptid=root

The underlying reason for this error is that the REST API is failing to calculate the set of root classes, i.e., the /roots endpoint is returning an empty set:

https://stagedata.bioontology.org/ontologies/EUROSCIVOCAPEU/classes/roots
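
The empty result can also be checked outside the browser. The following is a minimal sketch, assuming the endpoint returns a JSON array of class objects that each carry an `@id` key. The helper names `roots_uri` and `root_ids` are hypothetical, and the `apikey` query parameter is an assumption about how the deployment authenticates requests.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: builds the /classes/roots URL for an ontology acronym.
# The apikey parameter is an assumption; omit it if the instance does not need one.
def roots_uri(base_url, acronym, apikey = nil)
  uri = URI.join(base_url, "/ontologies/#{acronym}/classes/roots")
  uri.query = URI.encode_www_form(apikey: apikey) if apikey
  uri
end

# Hypothetical helper: extracts root class IDs from a response body,
# assuming the body is a JSON array of objects with an "@id" key.
def root_ids(json_body)
  parsed = JSON.parse(json_body)
  return [] unless parsed.is_a?(Array)

  parsed.map { |cls| cls["@id"] }.compact
end

# Example call (requires network access, so it is left commented out):
# uri = roots_uri("https://stagedata.bioontology.org", "EUROSCIVOCAPEU", "YOUR_API_KEY")
# ids = root_ids(Net::HTTP.get(uri))
# puts ids.empty? ? "No root classes returned" : ids
```

An empty array from `root_ids` corresponds to the "Problem retrieving classes" symptom described above.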

I've looked at the content of the ontology, and it appears to meet our minimum standards for SKOS support, which are detailed on our wiki:

https://www.bioontology.org/wiki/SKOSSupport

In particular, the SKOS ontology contains the necessary ConceptScheme with a list of hasTopConcept declarations:

...
<rdf:Description rdf:about="http://data.europa.eu/8mn/euroscivoc/40c0f173-baa3-48a3-9fe6-d6e8fb366a00">
    <startDate xmlns="http://publications.europa.eu/ontology/euvoc#" rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2019-12-02</startDate>
    <created xmlns="http://purl.org/dc/terms/" rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2019-12-02</created>
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
    <versionInfo xmlns="http://www.w3.org/2002/07/owl#">1.3</versionInfo>
    <prefLabel xmlns="http://www.w3.org/2004/02/skos/core#" xml:lang="en">EuroSciVoc</prefLabel>
    <prefLabel xmlns="http://www.w3.org/2008/05/skos-xl#" rdf:resource="http://data.europa.eu/8mn/euroscivoc/1723cd23-0e0e-4092-9709-49f42c5fb100"/>
    <hasTopconcept xmlns="http://www.w3.org/2004/02/skos/core#" rdf:resource="http://data.europa.eu/8mn/euroscivoc/12c980a5-034f-4cd5-8ce7-fcfe109c675c"/>
    <hasTopconcept xmlns="http://www.w3.org/2004/02/skos/core#" rdf:resource="http://data.europa.eu/8mn/euroscivoc/64605fff-1946-4fd4-b021-e2e83b71dcac"/>
    <hasTopconcept xmlns="http://www.w3.org/2004/02/skos/core#" rdf:resource="http://data.europa.eu/8mn/euroscivoc/7a2dcf7b-2c20-468a-a81c-76d67c14de31"/>
    <hasTopconcept xmlns="http://www.w3.org/2004/02/skos/core#" rdf:resource="http://data.europa.eu/8mn/euroscivoc/9b9abbee-82f5-4766-bd4d-feb73cf06573"/>
    <hasTopconcept xmlns="http://www.w3.org/2004/02/skos/core#" rdf:resource="http://data.europa.eu/8mn/euroscivoc/9e8e1abb-1a14-40ef-9658-782092fa5cf6"/>
    <hasTopconcept xmlns="http://www.w3.org/2004/02/skos/core#" rdf:resource="http://data.europa.eu/8mn/euroscivoc/3d76a7f4-5a16-411e-ae44-7b712d5222ee"/>
    <identifier xmlns="http://purl.org/dc/terms/">http://data.europa.eu/8mn/euroscivoc/40c0f173-baa3-48a3-9fe6-d6e8fb366a00</identifier>
</rdf:Description>
...

Based on the snippet above, I would expect there to be 6 root classes returned by the REST API. Note that Protege correctly recognizes the ConceptScheme with associated hasTopConcepts:

[Screenshot: Screen Shot 2022-05-12 at 1 13 41 PM]

jvendetti commented 2 years ago

From @graybeal:

I saw this once when there was a recursion in the hierarchy (A broaderThan B broaderThan A), if I recall correctly. Another possibility to consider.
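
To rule out the recursion possibility, a cycle check over the broader relation can be sketched as below. This is only an illustration: it assumes the broader links have already been extracted into a Ruby hash mapping each concept ID to an array of its broader concept IDs, and the method names `find_broader_cycle` and `walk_broader` are hypothetical.

```ruby
# Hypothetical helper: returns one cycle in the "broader" relation as an array
# of concept IDs (first and last element identical), or nil if none is found.
# `broader` maps each concept ID to an array of its broader concept IDs.
def find_broader_cycle(broader)
  state = {} # concept ID => :visiting (on the current path) or :done
  broader.each_key do |start|
    cycle = walk_broader(start, broader, state, [])
    return cycle if cycle
  end
  nil
end

# Depth-first walk along broader links, tracking the current path.
def walk_broader(node, broader, state, path)
  return nil if state[node] == :done
  # Reaching a node that is still on the current path closes a cycle.
  return path.drop(path.index(node)) + [node] if state[node] == :visiting

  state[node] = :visiting
  path.push(node)
  Array(broader[node]).each do |parent|
    cycle = walk_broader(parent, broader, state, path)
    return cycle if cycle
  end
  path.pop
  state[node] = :done
  nil
end

# Usage with made-up IDs: A has broader B, and B has broader A.
p find_broader_cycle({ "A" => ["B"], "B" => ["A"] })
```

A `nil` result means no concept is reachable from itself through broader links, so recursion would not explain a missing set of roots.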

jvendetti commented 2 years ago

Example unit test code that can be placed in test_ontology_submission.rb to debug this issue:

def test_euroscivoc_roots
  submission_parse("EUROSCIVOC_TST", "EuroSciVoc Test",
                   "./test/data/ontology_files/EuroSciVoc-skos-ap-eu.rdf", 1,
                   process_rdf: true, index_search: false, run_metrics: false)
  sub = LinkedData::Models::OntologySubmission.where(ontology: [acronym: "EUROSCIVOC_TST"], submissionId: 1)
                                              .include(:version).first
  roots = sub.roots
  refute_empty roots, 'Failed to return root classes for EuroSciVoc'
end

Example SPARQL query to retrieve root classes:

SELECT DISTINCT ?root WHERE {
GRAPH <http://data.bioontology.org/ontologies/EUROSCIVOC_TST/submissions/1> {
  ?x <http://www.w3.org/2004/02/skos/core#hasTopConcept> ?root .
}}

syphax-bouazzouni commented 2 years ago

Hello, if I may offer some modest help with this: I think the issue is just a misspelling of the SKOS hasTopConcept property.

In the resource file it is written hasTopconcept, but the correct spelling is hasTopConcept.
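
A case mismatch like this is easy to miss by eye, so a small scan can help. The sketch below is a rough illustration rather than a real RDF parser: it assumes the RDF/XML declares the SKOS namespace inline on each element (as in the snippet earlier in this thread) or uses full SKOS IRIs, it checks only a partial, hand-picked list of SKOS terms, and the method name `miscased_skos_terms` is hypothetical.

```ruby
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Partial list of canonical SKOS local names, for illustration only.
SKOS_TERMS = %w[
  Concept ConceptScheme Collection
  prefLabel altLabel hiddenLabel notation
  broader narrower related
  inScheme hasTopConcept topConceptOf
  definition note
].freeze

# Hypothetical helper: returns a hash mapping each miscased SKOS name found
# in the text to its canonical spelling. Uses regular expressions only, so it
# will not see names declared through a namespace prefix such as skos:.
def miscased_skos_terms(rdf_xml)
  canonical = SKOS_TERMS.to_h { |term| [term.downcase, term] }
  ns = Regexp.escape(SKOS_NS)
  # Pattern 1: <name xmlns="...skos/core#" ...>
  # Pattern 2: full IRIs such as ...skos/core#ConceptScheme
  names = rdf_xml.scan(/<(\w+)\s+xmlns="#{ns}"/).flatten +
          rdf_xml.scan(/#{ns}(\w+)/).flatten
  names.uniq.each_with_object({}) do |name, found|
    expected = canonical[name.downcase]
    found[name] = expected if expected && expected != name
  end
end

# Usage with a line taken from the ontology snippet in this issue.
sample = '<hasTopconcept xmlns="http://www.w3.org/2004/02/skos/core#" ' \
         'rdf:resource="http://data.europa.eu/8mn/euroscivoc/12c980a5-034f-4cd5-8ce7-fcfe109c675c"/>'
miscased_skos_terms(sample).each do |found, expected|
  puts "Found #{found}, expected #{expected}"
end
```

Comparing names case-insensitively against a known vocabulary flags only near misses, so unrelated properties from other namespaces are ignored.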

jvendetti commented 2 years ago

OMG, a single character in the wrong case.

😫

Thank you for pointing that out Syphax - your eyesight is clearly better than mine.

I modified the hasTopConcept declarations to use the correct case and uploaded a new version of the ontology to our staging environment. I can confirm that this fixes the issue.

[Screenshot: Screen Shot 2022-05-26 at 6 07 26 AM]

xeniacs commented 2 years ago

Hi, thank you so much for looking into this. I made the change in EcoPortal as well (thanks, Syphax!). I did this on Thursday, as soon as I read about the typo. After the correction, for some time on Thursday I could access the classes without being logged in, but the problem persisted when I was logged in as admin. I thought it might be a cache issue. Come Monday, I cannot access the classes either as admin or as an unsubscribed user. I checked the SKOS file and it is the corrected version, so there was no problem with the update. Any tips on what to check in our instance? Thanks heaps.

jvendetti commented 2 years ago

@xeniacs - was there a newer version of the ontology that was uploaded over the weekend?

Are you able to access your ontology classes and roots in a web browser via the REST API, e.g.:

http://{ip_address_of_appliance}:8080/ontologies/EUROSCIVOC/classes
http://{ip_address_of_appliance}:8080/ontologies/EUROSCIVOC/classes/roots

You mentioned that you thought it might be a caching issue. Just to clarify, did you clear all of the application caches on the Admin -> Site Administration tab in your web UI?