This PR contains sample code for training and serving embeddings for real-time similarity matching. The system uses a BigQuery ML Matrix Factorization model to train the embeddings, and the open-source ScaNN framework to build an approximate nearest neighbor index.
Compute pointwise mutual information (PMI) between items based on their cooccurrences.
Train item embeddings with BigQuery ML Matrix Factorization, using item PMI as implicit feedback.
Export and post-process the embeddings from the BigQuery ML model to Cloud Storage as CSV files using Cloud Dataflow.
Implement an embedding lookup model using Keras and deploy it to AI Platform Prediction.
Serve the embeddings as an approximate nearest neighbor index using ScaNN.
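
The PMI step can be illustrated with a minimal Python sketch. It is not the code in this PR, which computes PMI in BigQuery; it only shows the quantity being computed. It assumes that co-occurrence means two items appearing in the same group (for example a playlist or a basket), that probabilities are estimated as the fraction of groups containing an item or a pair, and that the logarithm is base 2. The function name `compute_pmi` and the sample data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def compute_pmi(groups):
    """Return {(item_a, item_b): pmi} for every pair that co-occurs at least once.

    Assumption: each group is an iterable of item ids, and probabilities are
    estimated as the fraction of groups that contain the item or the pair.
    """
    item_counts = Counter()
    pair_counts = Counter()
    for group in groups:
        # De-duplicate and sort so each pair gets one canonical (a, b) key.
        items = sorted(set(group))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    total = len(groups)
    pmi = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / total
        p_a = item_counts[a] / total
        p_b = item_counts[b] / total
        # PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) )
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


# Hypothetical sample data: four groups of item ids.
sample_groups = [["a", "b"], ["a", "b"], ["c", "d"], ["a", "c"]]
sample_pmi = compute_pmi(sample_groups)
for pair, score in sorted(sample_pmi.items()):
    print(pair, round(score, 3))
```

Pairs that co-occur more often than their individual frequencies would predict get a positive score, and pairs that co-occur less often get a negative one. These scores play the role of the implicit feedback passed to matrix factorization in the next step.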