Open crohr opened 1 year ago
Sure, that would be lovely. The current list of videos is random, just to have something. In #3 I mentioned the ability to filter by tags, but there is no tagging system in place yet. Tags could be used to define those same topics, but they would also need to be built! There is a very preliminary system in place where users can edit talks/speakers, so we could get user-generated content. Another option would be to get a transcript of the video and run it through ChatGPT to extract tags.
Anyway, anything that is a step towards better suggestions is warmly welcome.
I was thinking of implementing similarity search with pgvector based on the description (and possibly the transcripts of the videos, yes), but it seems like you're using SQLite as the db and Meilisearch for search, and I don't think either of those supports vector columns. Would you be open to switching to Postgres instead of SQLite?
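To make the idea concrete, here is a minimal sketch of what similarity search over description embeddings boils down to, independent of pgvector or any particular database. The talk slugs, the 3-dimensional vectors, and the `related_talks` helper are all made up for illustration; real embeddings would come from an embedding model and be much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_talks(talk_id, embeddings, limit=3):
    """Return the ids of the `limit` talks most similar to `talk_id`, best match first."""
    query = embeddings[talk_id]
    scored = [
        (cosine_similarity(query, vector), other_id)
        for other_id, vector in embeddings.items()
        if other_id != talk_id  # never suggest the talk being viewed
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [other_id for _, other_id in scored[:limit]]


# Hypothetical toy data: one tiny vector per talk.
embeddings = {
    "rails-performance": [0.9, 0.1, 0.0],
    "rails-caching": [0.8, 0.2, 0.1],
    "ruby-concurrency": [0.1, 0.9, 0.2],
    "hotwire-intro": [0.0, 0.2, 0.9],
}

print(related_talks("rails-performance", embeddings, limit=2))
```

This is a brute-force scan over every talk. A vector column or index mostly makes the same ranking faster, so for a small catalogue the scan may well be enough.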
One of my side goals (for a side project that makes a lot of side things) is to see how far we can go with an SQLite database. What are the real blockers, and what benefits do we get from such a simple stack? I am documenting this and will either present a talk on it somewhere or write articles.
Therefore I don't want to switch to PostgreSQL, at least for now.
For vector search, there is this experimental feature from Meilisearch that was just released: https://github.com/meilisearch/product/discussions/621#discussioncomment-6183647
SQLite also has this extension: https://observablehq.com/@asg017/introducing-sqlite-vss
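For a sense of how far plain SQLite goes without any extension, here is a rough sketch that stores each embedding as a BLOB of 32-bit floats and ranks talks in application code. It deliberately does not use the sqlite-vss or Meilisearch APIs; the `talk_embeddings` table, the helper names, and the 2-dimensional vectors are assumptions for illustration only.

```python
import sqlite3
import struct


def pack_vector(vector):
    """Serialize a list of floats as little-endian 32-bit floats (4 bytes each)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_vector(blob):
    """Inverse of pack_vector."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def nearest_talks(conn, talk_id, limit=5):
    """Brute force: load every embedding and sort by squared Euclidean distance."""
    rows = conn.execute("SELECT talk_id, embedding FROM talk_embeddings").fetchall()
    vectors = {row_id: unpack_vector(blob) for row_id, blob in rows}
    query = vectors.pop(talk_id)  # the talk itself is not a candidate

    def squared_distance(vector):
        return sum((a - b) ** 2 for a, b in zip(query, vector))

    return sorted(vectors, key=lambda row_id: squared_distance(vectors[row_id]))[:limit]


# Hypothetical schema and data, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE talk_embeddings (talk_id INTEGER PRIMARY KEY, embedding BLOB NOT NULL)"
)
conn.executemany(
    "INSERT INTO talk_embeddings (talk_id, embedding) VALUES (?, ?)",
    [
        (1, pack_vector([1.0, 0.0])),
        (2, pack_vector([0.9, 0.1])),
        (3, pack_vector([0.0, 1.0])),
    ],
)

print(nearest_talks(conn, 1, limit=2))
```

At 4 bytes per float, a 1536-dimension embedding takes 6144 bytes per talk, so storage is unlikely to be the limiting factor. The cost is that every lookup reads the whole table, which is the part a dedicated vector extension would take over.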
I can relate: my latest side project also uses SQLite and a simple stack to deploy (no mrsk yet, but simple docker-compose + remote docker context).
I'll have a look at both solutions, thanks for the pointers!
Had a quick stab at it with Meilisearch, but I can't seem to send a vector with 1536 floats (the default output size of OpenAI's ada-002 model). Waiting for a reply on their side.
Hi @adrienpoly, when viewing a video, it might be interesting to have links to other videos on the same topic(s). Would you merge a PR that brings this feature?