jhu-idc / iDC-general

Contains non-code-base specific tickets relating to the Islandora8 for Digital Collection project

Full text search result returns only the matching pages, not the associated paged-content item #481

Closed by htpvu 2 years ago

htpvu commented 2 years ago

Reference: test case 13.1, https://docs.google.com/document/d/1ifMPX88Dj3O04TR9xaPbupCSCmSmxjwxmEpIZGWKWig/edit

jabrah commented 2 years ago

When doing a search, we want to match at the Paged Content item level, not the Page level. We'll need to find a way to associate Page metadata and media such as transcriptions and OCR with the Page's parent Paged Content item.

The most likely way to do this is through Drupal contexts (/admin/structure/context). We might be able to extend the current contexts for things like PDF derivatives so that the desired derivatives are associated with the Page's parent Paged Content item.

We may also want to see if we can collapse child Page items into the indexed data for the Paged Content item.
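
As a rough illustration of the "collapse" idea, here is a minimal Python sketch that folds the text of child Pages into a single search document per Paged Content item. It works on plain dictionaries rather than Drupal entities; the keys `id`, `model`, `member_of`, `title`, and `text`, and the function name `collapse_pages`, are all hypothetical stand-ins and not the actual Islandora or Search API field names.

```python
from collections import defaultdict


def collapse_pages(nodes):
    """Return one search document per Paged Content item.

    Each document carries the concatenated text (e.g. OCR or
    transcriptions) of the item's direct child Pages.
    All dictionary keys here are hypothetical, for illustration only.
    """
    # Group page text by the id of the parent the page is a member of.
    page_text = defaultdict(list)
    for node in nodes:
        if node["model"] == "Page" and node.get("member_of") is not None:
            page_text[node["member_of"]].append(node.get("text", ""))

    # Emit one document per Paged Content item, with child text folded in.
    docs = []
    for node in nodes:
        if node["model"] == "Paged Content":
            docs.append({
                "id": node["id"],
                "title": node["title"],
                "fulltext": " ".join(t for t in page_text[node["id"]] if t),
            })
    return docs
```

With documents shaped like this, a full text match on any page's text would surface the Paged Content item itself instead of the individual Page.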

jabrah commented 2 years ago

We may be able to fold this ticket into this one: https://github.com/jhu-idc/iDC-general/issues/461

jabrah commented 2 years ago

Maybe try a custom Search API processor for Solr that defines a custom field which knows how to walk the node hierarchy to find the Paged Content parent of a Page? https://www.drupal.org/docs/8/modules/search-api/developer-documentation/create-custom-fields-using-a-custom-processor
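
The hierarchy walk such a processor would need can be sketched independently of Drupal. The Python below is only an illustration of the lookup logic, not a Search API processor: nodes are plain dictionaries, and the keys `id`, `model`, and `member_of`, as well as the function name `find_paged_content_parent`, are hypothetical.

```python
def find_paged_content_parent(node_id, nodes_by_id, max_depth=10):
    """Walk up the membership chain from node_id.

    Returns the id of the first node whose model is "Paged Content"
    (the starting node itself counts), or None if there is none.
    Keys and names are hypothetical, for illustration only.
    """
    current = nodes_by_id.get(node_id)
    seen = set()  # guards against cycles in the membership chain
    while current is not None and current["id"] not in seen and len(seen) < max_depth:
        if current["model"] == "Paged Content":
            return current["id"]
        seen.add(current["id"])
        parent_id = current.get("member_of")
        current = nodes_by_id.get(parent_id) if parent_id is not None else None
    return None
```

A custom field populated this way would let each Page be indexed with the id of its Paged Content parent, so results could be grouped or reported at the parent level.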