NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
Apache License 2.0
758
stars
113
forks
source link
[BUG] The multi-stage example is broken due to recent changes in systems #1042
We are getting the error below due to some changes in faiss op in Systems from this unit test:
E testbook.exceptions.TestbookRuntimeError: An error occurred while executing the following cell:
E ------------------
E from merlin.systems.dag.ops.faiss import QueryFaiss, setup_faiss
E
E item_embeddings = np.ascontiguousarray(
E pd.read_parquet(os.path.join(BASE_DIR, "item_embeddings.parquet")).to_numpy()
E )
E setup_faiss(item_embeddings, faiss_index_path)
E ------------------
E
E
E ---------------------------------------------------------------------------
E IndexError Traceback (most recent call last)
E Cell In[10], line 6
E 1 from merlin.systems.dag.ops.faiss import QueryFaiss, setup_faiss
E 3 item_embeddings = np.ascontiguousarray(
E 4 pd.read_parquet(os.path.join(BASE_DIR, "item_embeddings.parquet")).to_numpy()
E 5 )
E ----> 6 setup_faiss(item_embeddings, faiss_index_path)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/faiss.py:209, in setup_faiss(item_vector, output_path, metric, item_id_column, embedding_column)
E 192 def setup_faiss(
E 193 item_vector: DataFrameLike,
E 194 output_path: str,
E (...)
E 197 embedding_column="embedding",
E 198 ):
E 199 """
E 200 Utiltiy function that will create a Faiss index from a set of embedding vectors
E 201
E (...)
E 207 target output path
E 208 """
E --> 209 ids = item_vector[item_id_column].to_numpy().astype(np.int64)
E 210 item_vectors = np.ascontiguousarray(
E 211 np.stack(item_vector[embedding_column].to_numpy()).astype(np.float32)
E 212 )
E 214 index = faiss.index_factory(item_vectors.shape[1], "IVF32,Flat", metric)
E
E IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
/usr/local/lib/python3.8/dist-packages/testbook/client.py:135: TestbookRuntimeError
@rnyak The change that led to this failure was motivated by the updates to the multi-stage notebook, and #1022 contains corresponding updates to make use of the changes to setup_faiss
Bug description
We are getting the error below due to some changes in faiss op in Systems from this unit test:
Steps/Code to reproduce bug
You can run this unit test to repro the error.
Expected behavior
Environment details
Additional context
There is an open PR in Merlin to update the
building_deploying_multi_stage_RecSys
example, hopefully that might address this issue.