NASA-PDS / planetary-data-engine

Free-text search capability for planetary data, services, tools, and information
Apache License 2.0

Restrict Sinequa Sandbox search to PDS sources #4

Closed jjacob7734 closed 1 year ago

jjacob7734 commented 1 year ago

💡 Description

The Sinequa Sandbox has been configured by the Science Discovery Engine (SDE) team with a variety of planetary sources, but only a subset are actually from PDS. This task is to restrict the sandbox search to only PDS sources so that we can more effectively assess the suitability of Sinequa for PDS search.

Acceptance of this ticket is determined by comparing results with the PDS legacy search.
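One way to make that comparison concrete is to measure how much the top results from each engine overlap for the same query. The sketch below is only an illustration: the function name `overlap_at_k` and the sample identifiers are hypothetical, and it assumes each engine's results can be reduced to an ordered list of comparable identifiers.

```python
def overlap_at_k(sinequa_results, legacy_results, k=10):
    """Return the fraction of the legacy top-k that also appears in the Sinequa top-k."""
    legacy_top = legacy_results[:k]
    if not legacy_top:
        # Nothing to compare against, so report zero overlap.
        return 0.0
    sinequa_top = set(sinequa_results[:k])
    shared = [item for item in legacy_top if item in sinequa_top]
    return len(shared) / len(legacy_top)


# Hypothetical result identifiers for a single query.
sinequa = ["doc-a", "doc-b", "doc-c", "doc-d"]
legacy = ["doc-b", "doc-a", "doc-e", "doc-f"]
print(overlap_at_k(sinequa, legacy, k=4))
```

A score near 1.0 would suggest the two engines surface largely the same items for that query; the measure ignores ranking order within the top k.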

tloubrieu-jpl commented 1 year ago

@jjacob7734 knows how to filter indices; he still needs to identify the ones related to PDS and select them.

tloubrieu-jpl commented 1 year ago

@jjacob7734 is still working on restricting the list of indices.

tloubrieu-jpl commented 1 year ago

@jjacob7734 is selecting the applicable indices (node URLs). Once this is done, the team will review them.

jjacob7734 commented 1 year ago

I restricted the Sinequa sources to just those that originate from a PDS node web site or API, or relate to a PDS tool/service. Sources that relate to planetary science but do not have a clear connection to PDS have been excluded, even if they are NASA sites.
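The selection rule described above can be sketched as a hostname allowlist. This is an illustration only, not how the sandbox was configured: the source names, the URLs other than `pds.nasa.gov`, and the `PDS_HOSTS` set are assumptions made up for the example, and the actual restriction was done through Sinequa's own configuration.

```python
from urllib.parse import urlparse

# Assumed allowlist of PDS hosts; illustrative, not the list used in the sandbox.
PDS_HOSTS = {"pds.nasa.gov"}


def is_pds_source(url, allowed_hosts=PDS_HOSTS):
    """Return True if the URL's host is an allowed host or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)


# Hypothetical source catalogue: name -> URL.
sources = {
    "pds_main": "https://pds.nasa.gov/",
    "pds_subsite": "https://node.pds.nasa.gov/data",   # hypothetical subdomain
    "other_nasa": "https://solarsystem.nasa.gov/",     # NASA site, but not PDS
}
kept = sorted(name for name, url in sources.items() if is_pds_source(url))
print(kept)
```

Matching on the host rather than on a substring of the URL keeps a NASA site that merely mentions PDS in its path from slipping through.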

jordanpadams commented 1 year ago

status: this has been updated. needs review with @jordanpadams and @tloubrieu-jpl in breakout

jjacob7734 commented 1 year ago

As discussed in the breakout meeting today, I will try configuring Sinequa with a new empty index containing just the PDS_API_Legacy_All source; we can add other sources later. In that source, we added the Product_Collection data source to the two that were already there (Product_Data_Set_PDS3 and Product_Bundle).

tloubrieu-jpl commented 1 year ago

We want to create a specific index with only the legacy API in it, but there is an error. The SDE team is going to help with that but @jjacob7734 should investigate on his own.

tloubrieu-jpl commented 1 year ago

@jjacob7734 found the root cause for the error. He is going to test it next.

tloubrieu-jpl commented 1 year ago

Mappings need to be configured to work with the additional endpoint (collections). @jjacob7734 is working on that.

tloubrieu-jpl commented 1 year ago

@jjacob7734 will set up a meeting with the SMD Sinequa team to debug the indexing of the PDS legacy API.

tloubrieu-jpl commented 1 year ago

@jjacob7734 had the meeting with SMD, but they have not been able to fix the issue yet. The plan is to get rid of the specific plugin. It is not clear when that will be fixed on the SMD side.

tloubrieu-jpl commented 1 year ago

A module is missing in the sandbox deployment to make the API connection work. That should be solved by the end of the week by the SMD team.

tloubrieu-jpl commented 1 year ago

No feedback from Ashish yet; @jjacob7734 will ping him again.

tloubrieu-jpl commented 1 year ago

The missing-module error is no longer occurring, but some configuration is still required to have Sinequa work with the PDS API.

@jjacob7734 asked for help, but the expert usually takes some time to answer.

tloubrieu-jpl commented 1 year ago

@jjacob7734 is now unblocked on Sinequa.

tloubrieu-jpl commented 1 year ago

@jjacob7734 it sounds like we can close this ticket. Can you write a note saying how that was done, or link to the documentation? Thanks.

jordanpadams commented 1 year ago

@jjacob7734 is documenting common functions for Sinequa.

jjacob7734 commented 1 year ago

The procedure to restrict Sinequa search to specific data sources or collections is documented here: https://github.com/NASA-PDS/planetary-data-engine/wiki/SDE%E2%80%90Sinequa#restricting-search-to-specific-sources-or-collections