dwyl / technology-stack

🚀 Detailed description + diagram of the Open Source Technology Stack we use for dwyl projects.

Can we Auto-suggest `tags` for `items`? #129

Open nelsonic opened 1 year ago

nelsonic commented 1 year ago

There are many potential (good) use cases for ML that can improve the lives/workflows of the people using our App. However, we want to avoid going down too much of an "AI" rabbit hole before we have a clearly defined feature. But ... I think it's fair to say that we all want to try something LLM-related ... 💭

So the idea with this issue is just to open the discussion. (please share your thoughts 🙏)

What are potential use-cases where we can feed data e.g. text, links or images into a model and get useful data out that will add value to the people using our App?

I've put a priority-2 on this because no person using the MVP/App has requested auto-suggestion/tagging (yet!). We think it could be a compelling use case to auto-categorise an item based on its content, suggest tags and even summarise longer bits of public text. For example, when someone saves a link to a blog post, it might be useful to summarise that content and pull keywords out as tags.
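To make "pull keywords out as tags" concrete, here is a minimal sketch of a purely local baseline that needs no model at all: count word frequencies and return the most common non-stopwords. The function name `suggest_tags`, the tiny `STOPWORDS` set and the `max_tags` parameter are all assumptions made for illustration; nothing like this exists in the MVP yet, and an LLM-based approach would presumably do better.

```python
import re
from collections import Counter
from typing import List

# Tiny illustrative stopword list (an assumption); a real one would be much longer.
STOPWORDS = {
    "the", "a", "an", "and", "or", "to", "of", "in", "is", "it", "for",
    "on", "with", "that", "this", "we", "our", "are", "as", "be", "can",
}


def suggest_tags(text: str, max_tags: int = 3) -> List[str]:
    """Return the most frequent non-stopword words in `text` as candidate tags."""
    # Lowercase the text and keep words of two or more characters.
    words = re.findall(r"[a-z][a-z0-9-]+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    # most_common sorts by count, highest first.
    return [word for word, _count in counts.most_common(max_tags)]


if __name__ == "__main__":
    item = "Elixir Phoenix tutorial: build a Phoenix app with Elixir and Phoenix LiveView"
    print(suggest_tags(item, max_tags=2))
```

Because everything runs in-process, no item text leaves the server, which keeps this baseline within the privacy requirement below. It would also give us something to compare any LLM suggestions against.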

We definitely want to avoid "Tech/solutions searching for a problem" https://en.wiktionary.org/wiki/solution_in_search_of_a_problem 🙄

Requirements

  1. Maintain data privacy for the people using the MVP/App i.e. no sending data to OpenAI (MSFT ... 🙄)
  2. Run on comparatively cheap hardware, e.g. not more than €20/month. Or, if we need to self-host it on our own hardware, we can easily do this, e.g. using our M1 Mac Mini: https://gist.github.com/cedrickchee/e8d4cb0c4b1df6cc47ce8b18457ebde0
  3. Document everything comprehensively so that a complete beginner can follow along.

privateGPT looks like a good starting point: https://github.com/imartinez/privateGPT

If you have found an easier/simpler way of running an LLM on our own hardware/infra (e.g. Fly.io), please share 🙏

LuchoTurtle commented 1 year ago

This sounds like an awesome idea. However, I have an "implementation" question. Assuming we're using privateGPT, we'd need a source of data for suggestions. Would we create our own embeddings? Since we can't create a simple LangChain application that communicates with ChatGPT to get suggestions (we would use Guidance to get tag suggestions), I'm wondering how we're going to do this.

Perhaps I'm missing an important detail? 🤔
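To illustrate what the "own embeddings" option could look like, here is a rough sketch that ranks existing tags by cosine similarity to the item text. The `embed` function is only a stand-in (a bag-of-words count vector) for whatever local embedding model we might pick, and `rank_tags`, `tag_examples` and the sample data are hypothetical names invented for this example, not part of privateGPT or any other library.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_tags(item_text: str, tag_examples: Dict[str, str], top_n: int = 2) -> List[str]:
    """Return up to `top_n` tags whose example text is most similar to the item."""
    item_vec = embed(item_text)
    scored = [(cosine(item_vec, embed(example)), tag)
              for tag, example in tag_examples.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop tags that share nothing with the item.
    return [tag for score, tag in scored[:top_n] if score > 0]


if __name__ == "__main__":
    # Hypothetical data: each existing tag mapped to text from items already tagged with it.
    tag_examples = {
        "cooking": "recipe bake oven dinner",
        "elixir": "elixir phoenix liveview deploy",
        "travel": "flight hotel trip",
    }
    print(rank_tags("Deploy a Phoenix LiveView app", tag_examples))
```

With a real embedding model in place of `embed`, the data source would simply be the items people have already tagged, so nothing has to leave our own infrastructure.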