NASA-PDS / planetary-data-engine

Free-text search capability for planetary data, services, tools, and information
Apache License 2.0

Develop Query Test Suite and Success Criteria #7

Closed jordanpadams closed 4 months ago

jordanpadams commented 10 months ago

💡 Description

tloubrieu-jpl commented 9 months ago

@jjacob7734 we need to describe in more detail what we expect from this theme, and create sub-tickets, or a single analysis sub-ticket to start with.

1) We need to extract the relevant user stories to define the test suite.
2) Then, for these tests, we compare the expected results with the actual results from Sinequa, customizing how each field is searched and the weight (boost) applied to each field.
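To make step 2 concrete, here is a minimal sketch of how one test case could compare expected and actual results. Everything in it is an assumption for illustration: `QueryTest`, `passes`, `reciprocal_rank`, the `top_k` threshold, and the example.org URLs are hypothetical, and the ranked list would have to come from whatever Sinequa returns for the query.

```python
from dataclasses import dataclass


@dataclass
class QueryTest:
    """One test case: a free-text query plus the URLs a user story expects near the top."""
    query: str
    expected_urls: list
    top_k: int = 10  # hypothetical success threshold, not an agreed criterion


def reciprocal_rank(expected_urls, ranked_urls):
    """Return 1/rank of the first expected URL found in ranked_urls, or 0.0 if none is found."""
    expected = set(expected_urls)
    for rank, url in enumerate(ranked_urls, start=1):
        if url in expected:
            return 1.0 / rank
    return 0.0


def passes(test, ranked_urls):
    """A test passes when every expected URL appears within the first top_k results."""
    top = set(ranked_urls[: test.top_k])
    return all(url in top for url in test.expected_urls)


# Made-up ranked results standing in for an actual search response.
test = QueryTest("cassini data", ["https://example.org/data/cassini"], top_k=3)
ranked = [
    "https://example.org/news/cassini",
    "https://example.org/data/cassini",
    "https://example.org/gallery/cassini",
]
print(passes(test, ranked), reciprocal_rank(test.expected_urls, ranked))
```

Re-running the same cases after each change to field weights (boosts) would show whether a change moves the expected pages up or down.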

jjacob7734 commented 9 months ago

User Stories: https://airtable.com/applgnSo7ROCVIjbd/shrvc3CFrOgqsfPlL/tblapwOzIaDVB1x3d

My summary/taxonomy of user stories: https://docs.google.com/spreadsheets/d/1qk-setU_rJ5zLv-jlrPv7ci5Wdxp2UYwt_RPgXaiS8U/edit#gid=0 (also see the Search Function column in the AirTable above)

jordanpadams commented 8 months ago

Status: @jjacob7734 is looking over the story categorization made early on in the task. To be discussed at the breakout.

jjacob7734 commented 8 months ago

Notes from @jjacob7734 + @jordanpadams breakout discussion on 10/5/23:

jjacob7734 commented 8 months ago

The following are example queries from @jordanpadams where SDE/Sinequa does not produce the best results. In discussion with SDE, some of these can be improved by fixing data curation problems.

Data pages are not on top:

  1. https://sciencediscoveryengine.nasa.gov/app/nasa-sba-smd/#/search?query=%7B%22name%22:%22query-smd-primary%22,%22text%22:%22cassini%20data%22,%22tab%22:%22all%22%7D
  2. https://sciencediscoveryengine.nasa.gov/app/nasa-sba-smd/#/search?query=%7B%22name%22:%22query-smd-primary%22,%22text%22:%22cassini%20data%22,%22tab%22:%22Data%22%7D
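The search links above carry the query as percent-encoded JSON in the URL fragment. As a rough sketch (the helper name `decode_sde_query` is hypothetical, and it assumes the fragment always has the `#/search?query=...` shape seen in these links), the query text and tab can be recovered for use in a test suite like this:

```python
import json
from urllib.parse import parse_qs, urlsplit


def decode_sde_query(url):
    """Return the JSON query object embedded in an SDE search URL fragment."""
    fragment = urlsplit(url).fragment             # "/search?query=%7B...%7D"
    _, _, query_string = fragment.partition("?")  # keep only "query=..."
    params = parse_qs(query_string)               # percent-decodes the value
    return json.loads(params["query"][0])


url = (
    "https://sciencediscoveryengine.nasa.gov/app/nasa-sba-smd/#/search?query="
    "%7B%22name%22:%22query-smd-primary%22,%22text%22:%22cassini%20data%22,"
    "%22tab%22:%22all%22%7D"
)
query = decode_sde_query(url)
print(query["text"], query["tab"])
```

For the two links above, the decoded objects differ only in `tab` (`all` versus `Data`), which makes it easy to store each example query as a small test record instead of a long URL.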

Suboptimal duplicate pages are referenced:

  1. In https://sciencediscoveryengine.nasa.gov/app/nasa-sba-smd/#/search?query=%7B%22name%22:%22query-smd-primary%22,%22text%22:%22cassini%22,%22tab%22:%22all%22,%22select%22:%5B%5B%22treepath:%20(%60Planetary%20Image%20Galleries%60:%60%2FPlanetary%20Science%2FData%2FPlanetary%20Image%20Galleries%2F*%60)%22,%22Treepath%22%5D%5D%7D, Planetary Image Galleries is a subset / copy of the data available at https://photojournal.jpl.nasa.gov/

jordanpadams commented 8 months ago

Status: Investigating relevance boosting and how that can be brought into the test suite and success criteria.

tloubrieu-jpl commented 8 months ago

Create 2 sub-tasks:

jordanpadams commented 7 months ago

📆 October status: Test suite software in work. On schedule.

jordanpadams commented 6 months ago

📆 November status: Test suite software in work. On schedule.

jordanpadams commented 5 months ago

📆 December status: In work. Completion delayed 1 sprint. No impact on delivery.

jordanpadams commented 4 months ago

Calling this done. Sinequa documentation is also available here: https://github.com/NASA-PDS/planetary-data-engine/wiki/SDE%E2%80%90Sinequa