horahoradev / PrometheusTube

Steal fire (videos) from the Gods (other video sites)
https://prometheus.tube
BSD 2-Clause "Simplified" License
74 stars, 2 forks

Recommender System Improvements #1

Open horahoradev opened 9 months ago

horahoradev commented 9 months ago

Context and Motivation

We currently use Gorse as our recommender system.

Gorse maintains its own view of our video ecosystem instead of reading from our source of truth (the Videoservice db in Postgres). It has its own database with its own schema, and we call the relevant API methods to manage the state of our entities within Gorse so that it can provide us with video recommendations.

The important code snippet is in https://github.com/horahoradev/PrometheusTube/blob/main/backend/video_service/internal/models/recommender.go#L102 , which is the main recommender code (this file is horribly messy in a general sense, I or someone needs to clean it up rofl).

If the user ID is 0, it's assumed to be an anonymous user (nice API bucko), and we provide a mix of most popular videos and nearest neighbors as recommendations. If the user ID is non-zero, it's an authenticated user, and we can provide personalized recommendations based on user signals (see the diagram below for an overview of how this works within Gorse).

[Image: diagram giving an overview of how recommendations work within Gorse]
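To make the branching described above concrete, here's a minimal Go sketch. It is not the code in recommender.go: the `Recommender` interface, `fakeRecommender`, and the `interleave` mixing strategy are all hypothetical stand-ins, since the issue doesn't say how popular videos and neighbors are actually combined.

```go
package main

// Recommender is a hypothetical abstraction over the Gorse calls we make;
// the method names are assumptions, not the real Gorse client API.
type Recommender interface {
	Popular(n int) ([]string, error)
	Neighbors(videoID string, n int) ([]string, error)
	ForUser(userID int64, n int) ([]string, error)
}

// anonymousUserID mirrors the "user ID 0 means anonymous" convention.
const anonymousUserID = 0

// Recommend returns up to n video IDs. Authenticated users get personalized
// results; anonymous users get a mix of neighbors and popular videos.
func Recommend(r Recommender, userID int64, videoID string, n int) ([]string, error) {
	if userID != anonymousUserID {
		return r.ForUser(userID, n)
	}
	neighbors, err := r.Neighbors(videoID, n)
	if err != nil {
		return nil, err
	}
	popular, err := r.Popular(n)
	if err != nil {
		return nil, err
	}
	return interleave(neighbors, popular, n), nil
}

// interleave alternates between a and b, skipping duplicates, until n items
// are collected or both lists are exhausted. This mixing rule is an assumption.
func interleave(a, b []string, n int) []string {
	seen := make(map[string]bool)
	out := make([]string, 0, n)
	for i := 0; len(out) < n && (i < len(a) || i < len(b)); i++ {
		for _, s := range [][]string{a, b} {
			if i < len(s) && !seen[s[i]] && len(out) < n {
				seen[s[i]] = true
				out = append(out, s[i])
			}
		}
	}
	return out
}

// fakeRecommender is an in-memory stand-in used only to exercise the sketch.
type fakeRecommender struct{ popular, neighbors, personal []string }

func (f fakeRecommender) Popular(n int) ([]string, error) { return f.popular, nil }

func (f fakeRecommender) Neighbors(videoID string, n int) ([]string, error) {
	return f.neighbors, nil
}

func (f fakeRecommender) ForUser(userID int64, n int) ([]string, error) {
	return f.personal, nil
}
```

Using a named constant (or, better, a separate code path for anonymous users) would also make the "0 means anonymous" convention less surprising to callers.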

What really concerns me here, beyond general code cleanliness, is https://github.com/horahoradev/PrometheusTube/blob/main/backend/video_service/internal/models/recommender.go#L121 . For every request to an individual video page, we end up making 20+ queries for video details, which makes video requests unnecessarily slow.

There are a few possible solutions here:

  1. We could batch the query.
  2. We could embed the information we need into Gorse itself (but that presents its own problems, because we would need to periodically synchronize from our source of truth; yuck!).
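For option 1, a rough sketch of what batching could look like with only the standard library is below. The `videos` table, its `id` and `title` columns, the `Video` struct, and the `int64` ID type are all assumptions for illustration, not the actual Videoservice schema. The idea is one round trip for all recommended IDs, then restoring the recommender's ranking in memory.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// Video holds the few fields a recommendation list needs (hypothetical shape).
type Video struct {
	ID    int64
	Title string
}

// buildBatchQuery returns a single SELECT covering every ID, using
// Postgres-style positional placeholders ($1, $2, ...).
// Table and column names are assumptions.
func buildBatchQuery(ids []int64) (string, []any) {
	placeholders := make([]string, len(ids))
	args := make([]any, len(ids))
	for i, id := range ids {
		placeholders[i] = fmt.Sprintf("$%d", i+1)
		args[i] = id
	}
	query := "SELECT id, title FROM videos WHERE id IN (" +
		strings.Join(placeholders, ", ") + ")"
	return query, args
}

// orderByIDs restores the recommender's ranking, since IN (...) gives no
// ordering guarantee. IDs with no matching row are skipped.
func orderByIDs(ids []int64, byID map[int64]Video) []Video {
	out := make([]Video, 0, len(ids))
	for _, id := range ids {
		if v, ok := byID[id]; ok {
			out = append(out, v)
		}
	}
	return out
}

// fetchVideos loads all recommended videos in one query instead of one per ID.
func fetchVideos(ctx context.Context, db *sql.DB, ids []int64) ([]Video, error) {
	if len(ids) == 0 {
		return nil, nil
	}
	query, args := buildBatchQuery(ids)
	rows, err := db.QueryContext(ctx, query, args...)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	byID := make(map[int64]Video, len(ids))
	for rows.Next() {
		var v Video
		if err := rows.Scan(&v.ID, &v.Title); err != nil {
			return nil, err
		}
		byID[v.ID] = v
	}
	if err := rows.Err(); err != nil {
		return nil, err
	}
	return orderByIDs(ids, byID), nil
}
```

Depending on the Postgres driver in use, passing a single array parameter with `WHERE id = ANY($1)` may be tidier than generating placeholders; the placeholder version is shown here only because it needs nothing beyond `database/sql`.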

Prerequisites

I'd recommend learning about the following topics before approaching this:

Goals

  1. Clean up recommender system usage
  2. Remove/optimize the per-recommendation SQL query

Sanket-Arekar commented 9 months ago

Hey @horahoradev, I would like to work on this issue. Can you please assign it to me?

horahoradev commented 9 months ago

this one is going to be really tough, i'll probably have to work with you on this