NASA-PDS / planetary-data-engine

Free-text search capability for planetary data, services, tools, and information
Apache License 2.0

Create test cases for evaluation of PDS search quality #14

Open jjacob7734 opened 8 months ago

jjacob7734 commented 8 months ago

💡 Description

Create a query test suite that can be used to evaluate the quality of a search technology. The test queries should be representative of the searches that real PDS users perform. Refer to the PDS User Stories for motivation: https://airtable.com/applgnSo7ROCVIjbd/shrvc3CFrOgqsfPlL/tblapwOzIaDVB1x3d. For each test query, provide the query string and list any documents or web pages that are known to be highly relevant and should appear near the top of any successful search result set.
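One possible shape for such a test case is sketched below in Python. This is only an illustration, not an agreed design: the `QueryTestCase` class, the `recall_at_k` helper, the cutoff `k`, and the document identifiers are all hypothetical names introduced here. The idea is to pair each query string with its known-relevant items and to score a ranked result list by the fraction of those items that appear in the top `k`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryTestCase:
    """One test query plus the items known to be highly relevant (hypothetical structure)."""
    query: str
    relevant: List[str] = field(default_factory=list)


def recall_at_k(case: QueryTestCase, ranked_results: List[str], k: int = 10) -> float:
    """Fraction of the known-relevant items found in the top k results."""
    relevant = set(case.relevant)
    if not relevant:
        return 0.0
    top_k = set(ranked_results[:k])
    return len(relevant & top_k) / len(relevant)


# Placeholder identifiers, not real PDS products.
case = QueryTestCase(query="example query", relevant=["doc-1", "doc-2"])
results = ["doc-9", "doc-1", "doc-7", "doc-2"]
score = recall_at_k(case, results, k=3)  # only doc-1 is in the top 3
```

A cutoff-based score like this only checks that relevant items appear near the top; it does not check their relative order.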

tloubrieu-jpl commented 7 months ago

@jjacob7734 is comparing search results from keyword search and Sinequa, treating keyword search as the ground truth.
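As a rough sketch of what that comparison could look like (an assumption, not a description of the comparison actually performed), the Python function below measures how many of the keyword search's top `k` results also appear in the other engine's top `k`. The function name `overlap_at_k`, the cutoff, and the result identifiers are hypothetical.

```python
from typing import List


def overlap_at_k(ground_truth: List[str], candidate: List[str], k: int = 10) -> float:
    """Share of the ground-truth top-k results that also appear in the candidate top-k."""
    truth_top = set(ground_truth[:k])
    if not truth_top:
        return 0.0
    candidate_top = set(candidate[:k])
    return len(truth_top & candidate_top) / len(truth_top)


# Placeholder result lists: keyword search is the reference, the other list is under test.
keyword_results = ["a", "b", "c"]
other_results = ["b", "x", "a"]
agreement = overlap_at_k(keyword_results, other_results, k=3)  # "a" and "b" are shared
```

This treats the top `k` as an unordered set, so it is insensitive to ranking differences within the cutoff.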

tloubrieu-jpl commented 7 months ago

@jjacob7734 after our work session today:

We assume web page search will be handled well by Sinequa, so we are not including those tests in the test suite.

We focus on search results from the legacy Solr registry indexed in Sinequa, searching on various entries of the PDS4 model (target, investigation, instrument_host, instrument, bundle, collection, document) with variations on the search, e.g.:

See the spreadsheet we started: https://docs.google.com/spreadsheets/d/1H3x8YSGlxW6yEpcyqYjr2IB-PieH4hYcpO3mrm0BHBg/edit#gid=0

tloubrieu-jpl commented 7 months ago

We don't need to work on facets yet.

tloubrieu-jpl commented 6 months ago

@jjacob7734 has made some progress on the test suite but needs feedback on the expected order of results.

tloubrieu-jpl commented 6 months ago

Some updates have been made to the spreadsheet.

jordanpadams commented 6 months ago

Status: @jjacob7734 to create a new task to add additional data sources to Sinequa for landing pages. Issues logging in right now.

jordanpadams commented 4 months ago

Rolling this back to the icebox until we can restart this task.