LD4P / qa_server

A rails engine with questioning authority gem installed to serve as an authority search server with normalized results.

Add Subauthorities to DAVE and QA for LCGFT #50

Closed sfolsom closed 4 years ago

sfolsom commented 5 years ago

The conclusion of a conversation with @elrayle and others during the Cornell meeting today is that we need subauthorities based on the spreadsheet for LCGFT column A, https://docs.google.com/spreadsheets/d/1rPvEoP9iYNkxJ0eWC8gXe3ci7e6mhW0da59xkGhadi0/edit?usp=sharing.

elrayle commented 5 years ago

@eichmann I currently have the following subauthorities configured for LCGFT. I do not believe the data is actually subset on these, so I think we can remove them.

    "subauthorities": {
      "person":         "Person",
      "organization":   "Organization",
      "place":          "Place",
      "intangible":     "Intangible",
      "geocoordinates": "GeoCoordinates",
      "work":           "Work"
    }

Instead, I believe the subauthorities should be

    "subauthorities": {
      "active":       "active",
      "deprecated":   "deprecated"
    }

eichmann commented 5 years ago

The active instances (really new and revised) have authoritativeLabels. The deprecated don't - apparently only have variantLabels... Suggestions?

eichmann commented 5 years ago

Also, would you like to handle the subauthority selection via a request parameter or have each with a separate service?

sfolsom commented 5 years ago

@eichmann The variant labels on the deprecated authority should bring up the deprecated authority along with the deletion note and the "use instead" pointers to the preferred authorities. See http://id.loc.gov/authorities/genreForms/gf2011026651.html for an example. The spreadsheet has been updated to describe this: https://docs.google.com/spreadsheets/d/1rPvEoP9iYNkxJ0eWC8gXe3ci7e6mhW0da59xkGhadi0/edit#gid=696491968

elrayle commented 5 years ago

@eichmann @sfolsom I am going to summarize what I understand to be what we need here based on discussions at the Cornell meeting this morning.

There should be a single lookup & batch URL that searches across all search fields listed under the search column in the spreadsheet. The search results will include active and deprecated entities that match the query based on index weighting as follows...

Pluses (+) indicate relative weighting in the index.

The graph returned with the search results should include everything needed for extended context. The extended context should include all the fields listed under the Display column in the spreadsheet.
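
A minimal sketch of the relative weighting described above, assuming made-up field names and weights. The actual search fields and their pluses live in the spreadsheet and are not reproduced here, so `FIELD_WEIGHTS`, `score`, and `search` are purely illustrative and do not reflect how the cache index is implemented.

```python
# Hypothetical fields and weights: each number stands in for a count of pluses (+).
FIELD_WEIGHTS = {
    "label": 3,       # stands in for a "+++" field
    "alt_label": 2,   # stands in for a "++" field
    "scope_note": 1,  # stands in for a "+" field
}

# Toy records; active and deprecated entities sit in the same pool.
RECORDS = [
    {"id": "deprecated-example", "label": ["Basketball television programs"]},
    {"id": "active-example",
     "label": ["Cutout animation films"],
     "alt_label": ["Collage animation films"]},
    {"id": "note-only-example",
     "scope_note": ["Films that create the illusion of movement in drawings, clay, ..."]},
]


def score(record, query):
    """Sum the weights of every field whose text contains the query (case-insensitive)."""
    needle = query.lower()
    total = 0
    for field, weight in FIELD_WEIGHTS.items():
        values = record.get(field, [])
        if any(needle in value.lower() for value in values):
            total += weight
    return total


def search(records, query):
    """Return the ids of matching records, highest score first."""
    scored = [(score(record, query), record) for record in records]
    hits = [(s, record) for s, record in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps input order on ties
    return [record["id"] for _, record in hits]
```

With these toy records, a query that matches a heavily weighted field ranks ahead of one that only matches a lightly weighted field, and deprecated entities are returned alongside active ones.
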

elrayle commented 5 years ago

Expected workflow...

elrayle commented 5 years ago

PR https://github.com/cul-it/qa_server/pull/81 updates QA configuration to match spreadsheet

elrayle commented 5 years ago

ACTION

Context & Queries

The following table identifies the extended context that should be added and queries that produce results with values for those context fields.

| extended context | query (result idx) | expected value | comments |
| --- | --- | --- | --- |
| Preferred label = skos:prefLabel | animation (0) | Cutout animation films | |
| Alternative label = skos:altLabel | animation (0) | [Collage animation films, ...] | |
| Scope note = skos:note | Animated films (1) | Films that create the illusion of movement in drawings, clay, ... | |
| Citation note = madsrdf:hasSource / madsrdf:citation-note | Cartographic materials | ["(cartographic materials: all materials that represent the whole or part of the Earth or any celestial body; includes two- and three-dimensional maps and plans (including maps of imaginary places); aeronautical, nautical, and celestial charts; atlases, globes; block diagrams; sections; aerial photographs with a cartographic purpose; birds-eye views (map views); etc.)"@en] | |
| Citation source = madsrdf:hasSource / madsrdf:citation-source | Cartographic materials | ["Anglo-American cataloguing rules, 2002:"] | |
| Citation status = madsrdf:hasSource / madsrdf:citation-status | Cartographic materials | ["found"] | |
| Broader = skos:broader / skos:prefLabel | animation (0) | [http://id.loc.gov/authorities/genreForms/gf2011026049] | |
| Narrower = skos:narrower / skos:prefLabel | Animated films (1) | [http://id.loc.gov/authorities/genreForms/gf2011026107, ...] | |
| rdf:type | animation (0) | [http://www.loc.gov/mads/rdf/v1#GenreForm, ...] | |
| Label = madsrdf:variantLabel | Basketball television programs | ["Basketball television programs"] | Only used for deprecated terms |
| Use instead = madsrdf:useInstead | Basketball television programs | [http://id.loc.gov/authorities/genreForms/gf2011026603] | Only used for deprecated terms |
| Use instead = madsrdf:useInstead / skos:prefLabel | Basketball television programs | ["Sports television programs"] | Only used for deprecated terms |
| Deletion note = madsrdf:deletionNote | Basketball television programs | ["This authority record has been deleted because the term is covered by the genre/form term {Sports television programs} (gf2011026603)."@en] | Only used for deprecated terms |

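
To illustrate how the predicate paths in the table could be applied, here is a rough sketch that walks a list of plain `(subject, predicate, object)` tuples. This is not the QA implementation: `CONTEXT_PATHS`, `follow`, `extended_context`, and the `urn:example:` subject are hypothetical names introduced for the example, and only a subset of the table's rows is mapped.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
MADS = "http://www.loc.gov/mads/rdf/v1#"

# Context label -> predicate path (one or two hops), copied from a subset of the table rows.
CONTEXT_PATHS = {
    "Preferred label": [SKOS + "prefLabel"],
    "Alternative label": [SKOS + "altLabel"],
    "Scope note": [SKOS + "note"],
    "Citation note": [MADS + "hasSource", MADS + "citation-note"],
    "Label": [MADS + "variantLabel"],
    "Use instead": [MADS + "useInstead", SKOS + "prefLabel"],
    "Deletion note": [MADS + "deletionNote"],
}


def follow(triples, subject, path):
    """Follow a chain of predicates from subject and return the objects at the end."""
    nodes = [subject]
    for predicate in path:
        nodes = [o for s, p, o in triples if p == predicate and s in nodes]
    return nodes


def extended_context(triples, subject):
    """Collect every context field that has at least one value for the subject."""
    context = {}
    for label, path in CONTEXT_PATHS.items():
        values = follow(triples, subject, path)
        if values:  # fields with no values are left out entirely
            context[label] = values
    return context


# Toy graph for a deprecated term; the subject URI is a placeholder, not a real id.
DEPRECATED = "urn:example:deprecated-term"
REPLACEMENT = "http://id.loc.gov/authorities/genreForms/gf2011026603"
TRIPLES = [
    (DEPRECATED, MADS + "variantLabel", "Basketball television programs"),
    (DEPRECATED, MADS + "useInstead", REPLACEMENT),
    (REPLACEMENT, SKOS + "prefLabel", "Sports television programs"),
]

context = extended_context(TRIPLES, DEPRECATED)
```

Because fields without values are simply omitted, the same mapping can be run against both active and deprecated terms in this sketch.
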
elrayle commented 5 years ago

Question:

sfolsom commented 5 years ago

ACTION

  • [x] SPARQL query in cache updated to return context (@eichmann)
  • [x] QA configured to return extended context (@elrayle)
  • [x] Identify queries that produce results with values for context fields (@sfolsom)
  • [x] Verify that test queries run in QA correctly extract the extended context (@elrayle)
  • [x] Push to production (@elrayle)

I've added missing queries.

sfolsom commented 5 years ago

Question:

  • Since context is different for deprecated and active, should they be defined as separate authorities?

Because I'm having trouble coming up with specifics about how one config file might cause undesired data to come back, I say we go with one file and revisit the decision later if we need to.

elrayle commented 5 years ago

Remaining Work

active vs. deprecated subauths

There are two subauths. In the URL to the cache, they are identified by entity=deprecated | active. In the cache, they are searched by creating a query such that...

deprecated =

active = ! deprecated

@eichmann is updating the queries to process the entity parameter.
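
A small sketch of the `entity` parameter idea, under stated assumptions: the base URL and the `q` parameter name are placeholders, and the test for "deprecated" is passed in by the caller because the actual cache query is not shown in this thread. The only parts taken from the discussion are the `entity=deprecated | active` values and the rule that active is the negation of deprecated.

```python
from urllib.parse import urlencode

CACHE_BASE = "https://cache.example.org/lcgft/search"  # hypothetical endpoint
ENTITIES = ("active", "deprecated")


def cache_url(query, entity):
    """Build a cache URL that carries the subauthority in the entity parameter."""
    if entity not in ENTITIES:
        raise ValueError(f"entity must be one of {ENTITIES}, got {entity!r}")
    # "q" is an assumed parameter name; only "entity" comes from the thread.
    return CACHE_BASE + "?" + urlencode({"q": query, "entity": entity})


def select(records, entity, is_deprecated):
    """Filter records by subauthority; active is defined as not deprecated."""
    if entity not in ENTITIES:
        raise ValueError(f"entity must be one of {ENTITIES}, got {entity!r}")
    if entity == "deprecated":
        return [r for r in records if is_deprecated(r)]
    return [r for r in records if not is_deprecated(r)]


# Toy usage: the "deprecated" flag is a stand-in for whatever the real query checks.
records = [
    {"label": "Basketball television programs", "deprecated": True},
    {"label": "Sports television programs", "deprecated": False},
]
active = select(records, "active", lambda r: r["deprecated"])
```

Defining active as the complement of deprecated means every record falls into exactly one of the two subauthorities.
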

update query for extended context

This was one of the first auths that included extended context. @eichmann is rewriting the queries for LCGFT to match the approach taken for authorities where extended context was added more recently.

Will need to re-verify the extended context once complete.

eichmann commented 4 years ago

Clicked the wrong button!

eichmann commented 4 years ago

Fix deployed to production.

elrayle commented 4 years ago

This work is complete and in production.