evilsocket / cake

Distributed LLM and StableDiffusion inference for mobile, desktop and server.

About the reason of having cluster nodes #10

Closed hafezmg48 closed 1 month ago

hafezmg48 commented 1 month ago

Thanks for your valuable contribution. I have a question that needs some clarification; it would probably also be worth mentioning in the README for clarity. From my basic understanding, cake splits the model into its layers and distributes those layers across separate nodes because a huge 70B model will not fit on a single consumer GPU. My question is: what is the benefit of having a cluster of these nodes on the network instead of a single worker that loads and offloads each layer of the model one by one? Since model inference is sequential, each node has to wait for the previous layers to finish before it can start its own work, so having multiple nodes would appear redundant, unless there is some sort of pipelining mechanism that feeds batches to the nodes one at a time so they can work concurrently. Is that the intention here? Could you please provide some guidance and explanation on this? Thanks again.
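To make the pipelining idea concrete, here is a minimal sketch (not cake's actual scheduler, just an illustration of the concept being asked about): each "node" owns a slice of layers and pulls micro-batches from a queue, so node i can start on batch k+1 while node i+1 is still processing batch k. Simple arithmetic stands in for the transformer layer slices.

```python
import queue
import threading

def make_stage(layer_fn, inq, outq):
    """Run one pipeline stage: apply this node's layer slice to each micro-batch."""
    def run():
        while True:
            item = inq.get()
            if item is None:          # sentinel: propagate shutdown downstream
                outq.put(None)
                break
            outq.put(layer_fn(item))  # forward the activations to the next node
    return threading.Thread(target=run)

# Three "nodes", each holding a different slice of the model's layers
# (arithmetic ops stand in for the actual layer computations).
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
queues = [queue.Queue() for _ in range(len(stages) + 1)]
threads = [make_stage(f, queues[i], queues[i + 1]) for i, f in enumerate(stages)]
for t in threads:
    t.start()

# Feed micro-batches: with a single worker these would run strictly one after
# another; with the pipeline, all three stages work concurrently once full.
for x in [1, 2, 3, 4]:
    queues[0].put(x)
queues[0].put(None)

results = []
while (r := queues[-1].get()) is not None:
    results.append(r)
for t in threads:
    t.join()
print(results)  # [1, 3, 5, 7]
```

With a single micro-batch the pipeline degenerates to the sequential case the question describes; the win only appears once several micro-batches are in flight at the same time.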

evilsocket commented 1 month ago

As per README:

The goal of the project is being able to run big (70B+) models by repurposing consumer hardware into an heterogeneous cluster of iOS, Android, macOS, Linux and Windows devices, effectively leveraging planned obsolescence as a tool to make AI more accessible and democratic.

I would also like to add: coding is fun, sometimes you do things just to see if they can be done.

evilsocket commented 1 month ago

yes it will be interesting to see how node reuse over batching can increase performance ... there's an attention layer with kv-cache already, so in a way information gets cached, but that would push things further if i understand correctly ... definitely food for thought! thank you @hafezmg48 :D It would be interesting to keep this conversation going on the Discord server if you want https://discord.com/invite/btZpkp45gQ