ubiquity / ubiquibot

Putting the 'A' in 'DAO'
https://github.com/marketplace/ubiquibot

Open Source LLM #894

Open 0x4007 opened 7 months ago

0x4007 commented 7 months ago

Given all the recent developments with the OpenAI team, I am beginning research on using open-source LLMs instead of OpenAI.

We just need to think through which models are best for our planned use cases, and figure out where to host them.

Of course, as always, if it's realistic to host on GitHub Actions, that would be great. Unfortunately, I feel we may need to set up some type of VPS (the least desirable option due to maintenance).

Just opening up the conversation here. I will share my research results soon-ish.

https://x.com/teknium1/status/1727126311578247397?s=46&t=bdMjuqzO5LYxLUsloRROxQ

gitcoindev commented 7 months ago

Anthropic rejected the takeover offer, so IMHO either Microsoft will buy them or the board will be replaced. It is very unlikely that OpenAI and the models it has developed will disappear, taking into account how much money and effort have already been spent. In the worst-case scenario, I feel an enterprise GitHub Actions plan would do the job for an open-source LLM, though it might be expensive.

gitcoindev commented 7 months ago

Another option would be to look into Anthropic and their Claude 2 model. Their CEO is a former head of research at OpenAI, and the majority of their funding came from Amazon. A few details on this model are available at https://www.forbes.com/sites/sunilrajaraman/2023/11/21/openai-faces-competitive-pressure-from-anthropic/?sh=42cd65ef5352

0x4007 commented 7 months ago

Regarding the planned AI-powered features, I think the most cognitively complex work will be around code. For example, reviewing finalized pull requests to check whether the changes fulfill the specification before requesting reviews from our human reviewers.

The other features shouldn't require a state-of-the-art LLM (e.g., checking a comment's relevance to the specification, i.e., whether the comment is on topic).

What's nice about self-hosting is that our costs should remain very manageable as we onboard partners.
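
As a minimal sketch of that relevance check, assuming a self-hosted model behind an OpenAI-compatible chat-completions endpoint (the base URL, model id, and prompt below are placeholders, not a settled design):

```typescript
// Hypothetical relevance check against a self-hosted, OpenAI-compatible endpoint.
const LLM_BASE_URL = "http://localhost:8000/v1"; // placeholder, e.g. a vLLM or llama.cpp server

interface ChatResponse {
  choices: { message: { content: string } }[];
}

async function isCommentRelevant(spec: string, comment: string): Promise<boolean> {
  const res = await fetch(`${LLM_BASE_URL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "open-source-model", // placeholder model id
      temperature: 0,
      messages: [
        { role: "system", content: "Answer strictly with 'yes' or 'no'." },
        { role: "user", content: `Specification:\n${spec}\n\nComment:\n${comment}\n\nIs the comment on topic?` },
      ],
    }),
  });
  const data = (await res.json()) as ChatResponse;
  return data.choices[0].message.content.trim().toLowerCase().startsWith("yes");
}
```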

gitcoindev commented 7 months ago

Latest news: the OpenAI board was replaced (my latter suggestion) https://twitter.com/OpenAI/status/1727206187077370115 . I am wondering how it will unfold.

whilefoo commented 7 months ago

Cloudflare has Workers AI, but it's still in beta: https://developers.cloudflare.com/workers-ai/
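
For reference, a minimal sketch of calling Workers AI over its REST API from TypeScript (the account id, API token, and model id are placeholders):

```typescript
// Hypothetical call to Cloudflare Workers AI via its REST endpoint.
const ACCOUNT_ID = "your-account-id"; // placeholder
const CF_API_TOKEN = "your-api-token"; // placeholder

async function runLlama(prompt: string): Promise<string> {
  const url = `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run/@cf/meta/llama-2-7b-chat-int8`;
  const res = await fetch(url, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${CF_API_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ prompt }),
  });
  // The API wraps the model output in a { result: { response } } envelope.
  const data = (await res.json()) as { result: { response: string } };
  return data.result.response;
}
```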

0x4007 commented 7 months ago

I'm very interested in building on Cloudflare infra, but unfortunately they appear to only have a few models:

https://developers.cloudflare.com/workers-ai/models/text-generation/

I've heard really good things about https://x.com/teknium1/status/1720188958154625296?s=46&t=bdMjuqzO5LYxLUsloRROxQ

whilefoo commented 7 months ago

> Of course, as always, if it's realistic to host on GitHub Actions, that would be great. Unfortunately, I feel we may need to set up some type of VPS (the least desirable option due to maintenance).

Running on GitHub Actions won't work because they don't have GPU instances (at least not for now).

  1. Use Transformers.js. It runs on edge runtimes, so anything like Supabase Edge Functions and Cloudflare Workers, but it is pretty limited in terms of available models.
  2. Use Hugging Face's Inference API. It's basically an API endpoint that executes most of the models on Hugging Face's servers. The downside is that the model needs to support the ONNX runtime; otherwise you need to convert it (see the sketch below).
  3. A more advanced version of the Inference API is Inference Endpoints, which deploys a model on a virtual machine at a chosen cloud provider, managed by Hugging Face, so there's no maintenance.
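
A minimal sketch of option 2, the hosted Inference API, from TypeScript (the token and model id are placeholders):

```typescript
// Hypothetical call to Hugging Face's hosted Inference API.
const HF_TOKEN = "hf_xxx"; // placeholder token
const MODEL = "mistralai/Mistral-7B-Instruct-v0.1"; // example hosted model

async function generate(prompt: string): Promise<string> {
  const res = await fetch(`https://api-inference.huggingface.co/models/${MODEL}`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${HF_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ inputs: prompt, parameters: { max_new_tokens: 256 } }),
  });
  // The text-generation task returns an array of { generated_text } objects.
  const data = (await res.json()) as { generated_text: string }[];
  return data[0].generated_text;
}
```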
0x4007 commented 7 months ago

Great research. All interesting stuff!

I just signed up for access to a GPU instance on GitHub Actions Runners.