NCEAS / metacatui

MetacatUI: A client-side web interface for DataONE data repositories
https://nceas.github.io/metacatui
Apache License 2.0

invalid resource map generated in the midst of connection issues #893

Open jeanetteclark opened 5 years ago

jeanetteclark commented 5 years ago

During the UCSB network outage this morning, I was updating a data package in the web editor (adding two data files). I clicked Submit and noticed that the resource map didn't index right away. Shortly afterward, I couldn't connect to the KNB at all because of the outage.

When we got back online and the resource map had still not indexed, @amoeba found this in the logs:

metacat-index 20190206-15:21:34: [ERROR]: SolrIndex.update - could not update the solr index for the object urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc since org.dspace.foresite.OREParserException: org.dspace.foresite.OREException: No Identifer statement was found for the resourceMap resource ('https://cn.dataone.org/cn/v1/resolve/urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc') [edu.ucsb.nceas.metacat.index.SolrIndex:update:604]

indicating that an invalid resource map was generated.

The PID is: urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc

amoeba commented 5 years ago

Here's the ORE in Turtle (ttl) format:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<https://cn.dataone.org/cn/v1/resolve/urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc>
    <http://www.openarchives.org/ore/terms/describes> <https://cn.dataone.org/cn/v1/resolve/urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc#aggregation> .

<https://cn.dataone.org/cn/v1/resolve/urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc#aggregation>
    <http://www.openarchives.org/ore/terms/isDescribedBy> <https://cn.dataone.org/cn/v1/resolve/urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc> .

<https://cn.dataone.org/cn/v1/resolve/urn:uuid:258c3c2d-8d11-48f6-b2ca-977c42bab59e>
    <http://purl.org/dc/elements/1.1/creator> [
        a <http://purl.org/dc/terms/Agent> ;
        <http://xmlns.com/foaf/0.1/name> "DataONE Java Client Library"
    ] ;
    <http://purl.org/dc/terms/identifier> "urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc" ;
    <http://purl.org/dc/terms/modified> "2019-02-06T15:19:12.536-08:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
    a <http://www.openarchives.org/ore/terms/ResourceMap> .

<https://cn.dataone.org/cn/v1/resolve/urn:uuid:258c3c2d-8d11-48f6-b2ca-977c42bab59e#aggregation>
    <http://purl.org/dc/elements/1.1/title> "DataONE Aggregation" ;
    a <http://www.openarchives.org/ore/terms/Aggregation> .

<https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3A28a62efe-84bc-4708-a27c-65f3e7338780>
    a <http://purl.dataone.org/provone/2015/01/15/ontology#Data> ;
    <http://www.w3.org/ns/prov#wasDerivedFrom> <https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Ac2b8f113-625e-4ac5-9c6f-9dae43bfe666>, <https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Ad961b364-0e6b-4b92-a319-f5bc02e3502a> .

<https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Aa7246100-ba6b-4f39-8b32-0f58f398138a>
    a <http://purl.dataone.org/provone/2015/01/15/ontology#Data> ;
    <http://www.w3.org/ns/prov#wasDerivedFrom> <https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Ac2b8f113-625e-4ac5-9c6f-9dae43bfe666> .

<https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Ac2b8f113-625e-4ac5-9c6f-9dae43bfe666>
    a <http://purl.dataone.org/provone/2015/01/15/ontology#Data> .

<https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Ad961b364-0e6b-4b92-a319-f5bc02e3502a>
    a <http://purl.dataone.org/provone/2015/01/15/ontology#Data> .

The ORE above shows some odd things:

Grepping catalina.out for the resource map PID, I see:

metacat-index 20190206-15:21:34: [ERROR]: SolrIndex.update - could not update the solr index for the object urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc since org.dspace.foresite.OREParserException: org.dspace.foresite.OREException: No Identifer statement was found for the resourceMap resource ('https://cn.dataone.org/cn/v1/resolve/urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc') [edu.ucsb.nceas.metacat.index.SolrIndex:update:604]
metacat-index 20190206-21:34:16: [ERROR]: SolrIndex.update - could not update the solr index for the object urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc since org.dspace.foresite.OREParserException: org.dspace.foresite.OREException: No Identifer statement was found for the resourceMap resource ('https://cn.dataone.org/cn/v1/resolve/urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc') [edu.ucsb.nceas.metacat.index.SolrIndex:update:604]
metacat-index 20190207-00:22:34: [ERROR]: SolrIndex.update - could not update the solr index for the object urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc since org.dspace.foresite.OREParserException: org.dspace.foresite.OREException: No Identifer statement was found for the resourceMap resource ('https://cn.dataone.org/cn/v1/resolve/urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc') [edu.ucsb.nceas.metacat.index.SolrIndex:update:604]
metacat-index 20190207-21:34:26: [ERROR]: SolrIndex.update - could not update the solr index for the object urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc since org.dspace.foresite.OREParserException: org.dspace.foresite.OREException: No Identifer statement was found for the resourceMap resource ('https://cn.dataone.org/cn/v1/resolve/urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc') [edu.ucsb.nceas.metacat.index.SolrIndex:update:604]
metacat 20190208-08:35:33: [ERROR]: D1ResourceHandler: Serializing exception with code 400: The previous identifier has already been made obsolete by: urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc [edu.ucsb.nceas.metacat.restservice.D1ResourceHandler:serializeException:536]
org.dataone.service.exceptions.InvalidRequest: The previous identifier has already been made obsolete by: urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc
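For anyone who wants to repeat this kind of log search in a script, here is a minimal sketch. It assumes each log line has the shape seen above (component, `YYYYMMDD-HH:MM:SS` timestamp, bracketed level, message). The pattern and the function name `errors_for_pid` are my own and not part of Metacat. Continuation lines such as the bare exception line above do not match the pattern and are skipped.

```python
import re

# Assumed line shape, inferred from the excerpts above:
#   "<component> <YYYYMMDD-HH:MM:SS>: [<LEVEL>]: <message>"
LOG_LINE = re.compile(
    r"^(?P<component>\S+) (?P<stamp>\d{8}-\d{2}:\d{2}:\d{2}): "
    r"\[(?P<level>\w+)\]: (?P<message>.*)$"
)


def errors_for_pid(lines, pid):
    """Return (component, timestamp) pairs for ERROR lines whose message mentions pid."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match["level"] == "ERROR" and pid in match["message"]:
            hits.append((match["component"], match["stamp"]))
    return hits
```

Calling `errors_for_pid(open("catalina.out"), "urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc")` would list when each component logged an error for that PID. The file path here is only an example.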

So I interpret this all to mean that the Data Package ORE was invalid and therefore not indexed.
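The condition in the error message can be checked against the Turtle above. There, the `dcterms:identifier` statement is attached to the URI ending in `urn:uuid:258c3c2d-8d11-48f6-b2ca-977c42bab59e`, while the `ore:describes` statement is on the URI for `urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc`, which is the resource named in the error. The sketch below approximates that check on plain (subject, predicate, object) tuples. It is not the foresite parser's actual logic, and the function name and the hand-copied triples are only for illustration.

```python
ORE_DESCRIBES = "http://www.openarchives.org/ore/terms/describes"
DCTERMS_IDENTIFIER = "http://purl.org/dc/terms/identifier"

RESOLVE_V1 = "https://cn.dataone.org/cn/v1/resolve/"
# Subject of the ore:describes statement (percent-encoded in the ORE above).
NEW_PID_URI = RESOLVE_V1 + "urn%3Auuid%3A06f21a3c-6676-4bce-bb6c-693e2f4e87fc"
# Subject that actually carries the dcterms:identifier statement.
OTHER_URI = RESOLVE_V1 + "urn:uuid:258c3c2d-8d11-48f6-b2ca-977c42bab59e"

# Two of the statements from the ORE above, copied by hand as tuples.
triples = [
    (NEW_PID_URI, ORE_DESCRIBES, NEW_PID_URI + "#aggregation"),
    (OTHER_URI, DCTERMS_IDENTIFIER, "urn:uuid:06f21a3c-6676-4bce-bb6c-693e2f4e87fc"),
]


def resource_maps_without_identifier(triples):
    """Return subjects that describe an aggregation but have no dcterms:identifier."""
    describing = {s for s, p, _ in triples if p == ORE_DESCRIBES}
    identified = {s for s, p, _ in triples if p == DCTERMS_IDENTIFIER}
    return sorted(describing - identified)
```

For the two statements above, `resource_maps_without_identifier(triples)` returns the URI for `urn:uuid:06f21a3c-...`, which matches the resource the indexer complained about.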

jeanetteclark commented 5 years ago

I just reproduced this bug by updating the package again after fixing the resource map, so the network issues appear to be unrelated.