bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://petals.dev
MIT License
9.07k stars · 505 forks

How can we make this work long term? #483

Open physiii opened 1 year ago

physiii commented 1 year ago

I do not understand how this can be stable long term, since it relies on donated memory and compute.

I would love to use this concept in production instead of AWS, but I do not see how it can work long term unless there is some way of bartering.

For example, if my computer is 10x more "powerful" (CPU/GPU/TPU, etc.), then you would have to contribute 10 units of compute for every 1 of mine.

That way the system stays balanced; otherwise you are relying on philanthropy, which is not stable.

Is there any way to incorporate a memory/compute based economy?
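The barter described above could be sketched as a simple credit ledger where peers earn in proportion to a benchmarked "power" score, so a node that is 10x faster earns 10 credits for every 1 earned by a baseline node. This is purely illustrative; `CreditLedger` and `power_score` are hypothetical names, not part of Petals:

```python
class CreditLedger:
    """Hypothetical compute-barter ledger: earn credits by serving, spend them on inference."""

    def __init__(self):
        self.balances = {}  # peer_id -> credits

    def record_service(self, peer_id, power_score, seconds_served):
        """Credit a peer for time served, weighted by its benchmarked power score."""
        earned = power_score * seconds_served
        self.balances[peer_id] = self.balances.get(peer_id, 0.0) + earned

    def spend(self, peer_id, cost):
        """Debit a peer for consuming inference; refuse if the balance is too low."""
        balance = self.balances.get(peer_id, 0.0)
        if balance < cost:
            raise ValueError(f"{peer_id} has insufficient credits ({balance:.1f} < {cost})")
        self.balances[peer_id] = balance - cost
        return self.balances[peer_id]
```

Under this scheme a 10x-faster node serving for the same hour simply accumulates 10x the spending power, which is the balance the comment asks for.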

shkarlsson commented 10 months ago

I agree. Traditional torrenting can generally rely on people donating resources because some hard drive space and bandwidth cost very little. Donating GPU time, which is essentially what happens at Petals, seems too steep a cost for donors and doesn't seem to work in practice (judging by the short list of donors over at https://health.petals.dev).

Is there some sort of discussion or project addressing this? Maybe an equivalent of closed torrenting sites that require some minimum donation ratio? Or built-in micropayments with crypto, as I think @earonesty suggests here?
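The minimum-donation-ratio idea from private trackers translates directly to compute: only accept new inference requests from a peer while its contributed/consumed ratio stays above a threshold. A minimal sketch, with hypothetical names (nothing like this exists in Petals today):

```python
def may_request(contributed: float, consumed: float, min_ratio: float = 0.5) -> bool:
    """Return True if a peer's share ratio permits submitting another inference request.

    contributed / consumed are in the same units (e.g. GPU-seconds);
    min_ratio plays the role of a tracker's required seed ratio.
    """
    if consumed == 0:
        return True  # new peers haven't consumed anything yet, so let them start
    return contributed / consumed >= min_ratio
```

As with private trackers, the threshold and how contributions are measured (and benchmarked honestly) would be the hard policy questions, not the gate itself.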

ELigoP commented 4 months ago

I would look at this from a psychological perspective, too: if there were clear objectives for what the GPUs will be used for (e.g. research, benchmarks, etc.), it would attract attention and increase participation from people with GPUs.

E.g. with SETI@home and Folding@home, there is a clear cause to donate GPU time.

If the cause were to train LLMs for specific tasks, train private+RAG LLMs, or benchmark existing LLMs in ways that benefit individual cluster owners, that would help.