NCEAS / metadig-engine

MetaDig Engine: multi-dialect metadata assessment engine

Is the Solr endpoint still needed? #284

Open gothub opened 3 years ago

gothub commented 3 years ago

@mbjones @laurenwalker @csjx

The metadata assessment Solr endpoint has not been reachable from outside the NCEAS domain for about 2 years now (ufw rules block outside access).

We decided to stop working on having MetacatUI query this endpoint to aggregate assessment info for D3-based graphics inside MetacatUI, because the data volume caused too much lag between initiating a query and displaying a graphic.

Will we need to use Solr in the future to retrieve and aggregate assessment scores?
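
As a rough illustration of what such an aggregation request could look like, here is a minimal sketch that builds a query string for Solr's stats component over `scoreOverall`. The function name and the choice of filters are assumptions for illustration only, not the engine's actual API; the field names come from the assessment Solr documents.

```python
from urllib.parse import urlencode


def build_score_stats_query(suite_id, datasource):
    """Build a Solr query string asking for summary statistics on scoreOverall.

    Hypothetical helper: the filters used here are an assumption about how
    scores might be aggregated, not something the engine does today.
    """
    params = [
        ("q", "*:*"),
        # Restrict to one suite and one member node.
        ("fq", f'suiteId:"{suite_id}"'),
        ("fq", f'datasource:"{datasource}"'),
        # Only count the latest version in each obsolescence chain.
        ("fq", "isLatest:true"),
        # No documents needed, only the aggregate statistics.
        ("rows", "0"),
        ("stats", "true"),
        ("stats.field", "scoreOverall"),
        ("wt", "json"),
    ]
    return urlencode(params)


print(build_score_stats_query("arctic.data.center.suite.1", "urn:node:ARCTIC"))
```

With `rows=0`, Solr would return only the statistics (count, min, max, mean and so on) rather than the matching documents, which is one way to limit the data volume sent to a client.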

For reference, here is an example Solr document:

      {
        "metadataId":"doi:10.18739/A2ZC8P",
        "formatId":"https://nceas.ucsb.edu/mdqe/v1",
        "runId":"d3405650-cff6-481f-9f0d-95e62e5f7f56",
        "suiteId":"arctic.data.center.suite.1",
        "timestamp":"2019-09-16T23:23:46.148Z",
        "datasource":"urn:node:ARCTIC",
        "metadataFormatId":"eml://ecoinformatics.org/eml-2.1.1",
        "dateUploaded":"2016-04-02T08:22:35.117Z",
        "obsoletes":"urn:uuid:ef9a28cd-c3c6-40ec-b05a-a7e2b57ad2b5",
        "sequenceId":"urn:uuid:5bd73e7a-c875-429b-8ad3-060e42079ccc",
        "funder":["1107792"],
        "funderInfo":["Collaborative Research: Toward a Circumarctic Lakes Observation Network (CALON) (NSF 1107792)",
          "Collaborative Research: Toward a Circumarctic Lakes Observation Network (CALON)",
          "NSF",
          "1107792",
          "AON-Arctic Observing Network",
          "040100 NSF RESEARCH & RELATED ACTIVIT"],
        "rightsHolder":"CN=arctic-data-admins,DC=dataone,DC=org",
        "group":["CN=arctic-data-admins,DC=dataone,DC=org"],
        "checksPassed":14,
        "checksWarned":3,
        "checksFailed":1,
        "checksInfo":9,
        "checkCount":26,
        "scoreOverall":0.93333334,
        "scoreByType_identification_f":1.0,
        "scoreByType_interpretation_f":0.5,
        "scoreByType_discovery_f":1.0,
        "_version_":1644876243614564352,
        "isLatest":true},

mbjones commented 3 years ago

Possibly. We still need an API to download run-level scores and check-level scores, rather than individual reports. Let's talk about how the service will deliver both data for various collections and the graphics that go along with that.
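
To make the "data for various collections" part concrete, here is a minimal sketch of aggregating run-level scores once they have been downloaded as a list of records. It assumes the records carry the same fields as the Solr document above (`datasource`, `scoreOverall`, `isLatest`); the function name and the second and third sample records are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by(docs, key="datasource"):
    """Average scoreOverall per group, using only the latest version of each record."""
    groups = defaultdict(list)
    for doc in docs:
        # Skip obsoleted versions and records without an overall score.
        if doc.get("isLatest") and "scoreOverall" in doc:
            groups[doc.get(key, "unknown")].append(doc["scoreOverall"])
    return {group: mean(scores) for group, scores in groups.items()}


docs = [
    # Values taken from the example Solr document in this issue.
    {"metadataId": "doi:10.18739/A2ZC8P", "datasource": "urn:node:ARCTIC",
     "scoreOverall": 0.93333334, "isLatest": True},
    # Hypothetical records, present only to exercise the grouping.
    {"metadataId": "example:made-up-1", "datasource": "urn:node:ARCTIC",
     "scoreOverall": 0.5, "isLatest": True},
    {"metadataId": "example:made-up-2", "datasource": "urn:node:ARCTIC",
     "scoreOverall": 0.1, "isLatest": False},
]

result = mean_score_by(docs)
print(result)
```

Passing `key="suiteId"` or another field would group by that field instead, so the same summary could feed a per-suite or per-funder graphic.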