This repository is meant to optimize hybrid search settings for OpenSearch. It covers a grid search approach to identify a good parameter set and a model-based approach that dynamically identifies good settings for a query.
In notebooks/1_Prepare_OpenSearch.ipynb, added the missing increment of the attempts counter.
In notebooks/2_Index_Product_Data.ipynb, replaced the pandas iterrows() block with to_dict(), since iterating over a dict is significantly faster than iterating over a pandas DataFrame row by row.
In notebooks/2_Index_Product_Data.ipynb, added a function called split_into_batches() to facilitate indexing in batches.
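
For reviewers, here is a minimal sketch of what a batching helper like split_into_batches() could look like. The signature, the batch_size parameter, and the sample records are assumptions made for illustration; the notebook's actual implementation may differ. The records are plain dicts of the shape that DataFrame.to_dict(orient="records") returns.

```python
from typing import Dict, Iterator, List


def split_into_batches(records: List[Dict], batch_size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of `records` with at most `batch_size` items each.

    Assumed signature; the helper in the notebook may take different arguments.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Hypothetical product records, shaped like the output of
# DataFrame.to_dict(orient="records"): one dict per row.
records = [
    {"product_id": "p1", "title": "usb cable"},
    {"product_id": "p2", "title": "phone case"},
    {"product_id": "p3", "title": "desk lamp"},
    {"product_id": "p4", "title": "water bottle"},
    {"product_id": "p5", "title": "notebook"},
]

# Materialize the batches; the last one holds the remainder.
batches = list(split_into_batches(records, batch_size=2))
for batch_number, batch in enumerate(batches, start=1):
    print(batch_number, [record["product_id"] for record in batch])
```

Each batch can then be sent to the index in a single request instead of one request per document, which is the usual motivation for batching.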