htpvu closed this issue 2 years ago
When doing a search, we want to match at the Paged Content item level, not the Page level. We'll need to find a way to associate Page metadata and media such as transcriptions and OCR with the Page's parent Paged Content item.
The most likely way of doing this is through Drupal contexts (/admin/structure/context). We might be able to append to the current context for things like PDF derivatives so that the desired derivatives are associated with the Page's parent Paged Content item.
We may also want to see whether we can collapse child Page items into the indexed data for the Paged Content item.
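To make the "collapse" idea concrete, here is a minimal sketch in Python rather than Drupal code. It assumes each record carries a hypothetical `member_of` parent reference and a hypothetical `ocr` text field; neither name is taken from the iDC schema, and the real indexing would happen in Search API/Solr, not in a script like this.

```python
from collections import defaultdict

# Hypothetical, simplified records. The field names ("model", "member_of",
# "ocr") are illustrative only and are not the iDC or Islandora field names.
nodes = [
    {"id": 1, "model": "Paged Content", "title": "Ledger", "member_of": None},
    {"id": 2, "model": "Page", "title": "Page 1", "member_of": 1,
     "ocr": "received of Mr. Smith"},
    {"id": 3, "model": "Page", "title": "Page 2", "member_of": 1,
     "ocr": "paid to Mrs. Jones"},
]


def build_index_docs(nodes):
    """Return one search document per Paged Content item.

    Text from child Page records is folded into the parent's document, so a
    match on page text resolves to the Paged Content item.
    """
    # Group page text by the id of the parent the page is a member of.
    page_text = defaultdict(list)
    for node in nodes:
        if node["model"] == "Page" and node.get("member_of") is not None:
            page_text[node["member_of"]].append(node.get("ocr", ""))

    # Emit documents only for Paged Content items; Pages get no document.
    docs = []
    for node in nodes:
        if node["model"] == "Paged Content":
            docs.append({
                "id": node["id"],
                "title": node["title"],
                "page_text": " ".join(page_text[node["id"]]),
            })
    return docs


docs = build_index_docs(nodes)
```

With this shape, a search for a word that only appears in a page's OCR would hit the parent's `page_text` field, and the Page itself would never appear as a result.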
May be able to wrap up this ticket in this one: https://github.com/jhu-idc/iDC-general/issues/461
Maybe try a custom Search API processor for Solr that defines a custom field which knows how to walk the node hierarchy to get the Paged Content parent of the Page? https://www.drupal.org/docs/8/modules/search-api/developer-documentation/create-custom-fields-using-a-custom-processor
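The hierarchy walk itself could look roughly like the sketch below. It is written in Python only to show the logic; in Drupal this would live inside the custom processor described in the linked documentation. The `member_of` and `model` keys, the function name, and the `max_depth` guard are all assumptions made for illustration.

```python
def find_paged_content_ancestor(node_id, nodes_by_id, max_depth=10):
    """Walk parent references upward and return the id of the nearest
    ancestor whose model is "Paged Content", or None if there is none.

    nodes_by_id maps an id to a dict with hypothetical "model" and
    "member_of" keys. max_depth and the seen set guard against cycles.
    """
    node = nodes_by_id.get(node_id)
    seen = {node_id}
    for _ in range(max_depth):
        if node is None:
            return None
        parent_id = node.get("member_of")
        # Stop at the top of the hierarchy or if we loop back on ourselves.
        if parent_id is None or parent_id in seen:
            return None
        seen.add(parent_id)
        node = nodes_by_id.get(parent_id)
        if node is not None and node["model"] == "Paged Content":
            return parent_id
    return None


# Hypothetical hierarchy: Collection 10 > Paged Content 1 > Page 2.
nodes_by_id = {
    10: {"model": "Collection", "member_of": None},
    1: {"model": "Paged Content", "member_of": 10},
    2: {"model": "Page", "member_of": 1},
    4: {"model": "Page", "member_of": None},  # orphan page
    5: {"model": "Page", "member_of": 6},     # 5 and 6 form a cycle
    6: {"model": "Page", "member_of": 5},
}

parent_of_page = find_paged_content_ancestor(2, nodes_by_id)
```

The value returned for a Page is what the custom field would store, so search results could be grouped or redirected to the Paged Content item. Orphan pages and cyclic references return `None` instead of looping.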
Reference: test case 13.1, https://docs.google.com/document/d/1ifMPX88Dj3O04TR9xaPbupCSCmSmxjwxmEpIZGWKWig/edit