patdunlavey opened 4 days ago
From @DiegoPino on slack: Diego Pino 4:39 PM So there is one issue (why i avoided it...)
if (isset($allfields_translated_to_solr['parent_sequence_id']) &&
  isset($extradata_from_item['search_api_solr_document'][$allfields_translated_to_solr['parent_sequence_id']]) && !$result_is_top_media) {
  $sequence_number = (array) $extradata_from_item['search_api_solr_document'][$allfields_translated_to_solr['parent_sequence_id']];
  if (isset($sequence_number[0]) && !empty($sequence_number[0]) && ($sequence_number[0] != 0)) {
    // We do all this checks to avoid adding a strange offset e.g a collection instead of a CWS
    $page_number_by_id[$extradata_from_item['search_api_solr_document']['id']] = $sequence_number[0];
  }
}
@DiegoPino - in https://github.com/esmero/strawberryfield/blob/1.5.0/src/Controller/StrawberryfieldFlavorDatasourceSearchController.php#L112-L122 we have...
//search for IAB highlight
// if format=json, page=all and q not null
if (($input = $request->query->get('q')) && ($format == 'json') && ($page == 'all')) {
  $snippets = $this->flavorfromSolrIndex(
    $input,
    $node->id(),
    $processor,
    $fileuuid,
    $indexes
  );
Do I understand the "search for IAB highlight" comment to mean that the conditions there (non-empty 'q' param, result format = 'json', page = 'all') indicate that we are searching for, and intend to return, OCR highlights? If so (and assuming the if condition is valid for this purpose), can we use this as the test to determine whether the search results should be sorted by sequence IDs, both `sequence_id` and `parent_sequence_id`?
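To make the proposed test concrete, here is a minimal sketch of that condition pulled out into a standalone predicate. The function name `isOcrHighlightSearch` is hypothetical and does not exist in strawberryfield; it only mirrors the three checks in the snippet quoted above, and whether those checks really identify an OCR highlight search is exactly the open question.

```php
<?php

/**
 * Hypothetical helper (not part of strawberryfield): mirrors the quoted
 * condition that appears to identify an IAB OCR highlight search.
 */
function isOcrHighlightSearch(?string $q, string $format, string $page): bool {
  // Like the quoted code, this relies on PHP truthiness for 'q':
  // NULL, '' and '0' are all treated as "no query".
  return !empty($q) && $format === 'json' && $page === 'all';
}
```

If this turns out to be the right test, the same predicate could decide whether to apply a sequence-based sort or leave the results in relevance order.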
We have an object type that we call "CWSBook", which is basically a Creative Work Series (digital object collection) object with a series of "Page" objects as children that have the individual page image files. We are able to display this in IA Bookreader using the v2.1 IIIF Manifest. OCR is indexed for these pages with both `sequence_id` and `parent_sequence_id`, where the latter records the order of the Pages in the CWSBook.

The problem we are having is that when a search is performed in IAB, the search results are returned sorted by relevance rather than by page sequence, and this causes the previous/next links in the search results area to jump the user seemingly wildly back and forth.
This is because the sequence of the pages being displayed in IAB corresponds to (comes from?) the `parent_sequence_id` value indexed in Solr, not the `sequence_id`, whereas `\Drupal\strawberryfield\Controller\StrawberryfieldFlavorDatasourceSearchController::flavorfromSolrIndex` sorts only by `sequence_id`, when present.

I'm able to force the correct sequence in the results and correct IAB behavior in my testing with this simple change
However, I'm not sure if there are cases where this `flavorfromSolrIndex` method may be called where this sort is not the correct behavior. I also wonder if sorting by `sequence_id` (the current behavior) is always correct? The `flavorfromSolrIndex` method seems to be intended to be used pretty broadly. Are there cases where sorting only by relevance is correct?
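To illustrate the ordering problem outside of Solr, here is a minimal sketch in plain PHP. It is not the change referred to above and not how `flavorfromSolrIndex` builds its query; the function name `sortHitsBySequence` and the hit arrays are hypothetical, and the keys simply stand in for the indexed `parent_sequence_id` and `sequence_id` fields. It assumes each Page holds one file, so `sequence_id` is the same for every hit and only `parent_sequence_id` distinguishes the pages.

```php
<?php

/**
 * Hypothetical sketch: order search hits by page position, not relevance.
 * Each hit is a plain array; 'parent_sequence_id' and 'sequence_id' are
 * assumed keys standing in for the Solr fields discussed in this issue.
 */
function sortHitsBySequence(array $hits): array {
  usort($hits, function (array $a, array $b): int {
    // Hits that lack a value sort last.
    $parentA = $a['parent_sequence_id'] ?? PHP_INT_MAX;
    $parentB = $b['parent_sequence_id'] ?? PHP_INT_MAX;
    $seqA = $a['sequence_id'] ?? PHP_INT_MAX;
    $seqB = $b['sequence_id'] ?? PHP_INT_MAX;
    // Compare the parent order first, then the file-level sequence.
    return [$parentA, $seqA] <=> [$parentB, $seqB];
  });
  return $hits;
}

// Hits as they might arrive when sorted by relevance (highest score first).
$hits = [
  ['id' => 'page-7', 'parent_sequence_id' => 7, 'sequence_id' => 1, 'score' => 9.1],
  ['id' => 'page-5', 'parent_sequence_id' => 5, 'sequence_id' => 1, 'score' => 7.8],
  ['id' => 'page-2', 'parent_sequence_id' => 2, 'sequence_id' => 1, 'score' => 5.4],
];
$sorted = sortHitsBySequence($hits);
```

With only `sequence_id` as the sort key, all three hits tie and the order stays relevance-driven, which matches the jumping previous/next behavior described above. Putting `parent_sequence_id` first yields page-2, page-5, page-7.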