X-rayLaser / DistributedLLM

Run LLM inference by splitting models into parts and hosting each part on a separate machine. Project is no longer maintained.

failed to solve: process "/bin/sh -c make libllama.so && make libembdinput.so" did not complete successfully: #2

galenyu opened this issue 5 months ago

galenyu commented 5 months ago

Thank you for your excellent work! When I used `docker-compose` to build, I ran into the following error, which seems to indicate that llama.cpp failed to compile. Do you know a solution?

[screenshot of the build error]

By the way, this project does not support GPU acceleration and will no longer be maintained. Do you have any similar project recommendations?

Thanks a lot!

X-rayLaser commented 5 months ago

Hello. From the look of it, you are somehow building against a newer version of llama.cpp... or maybe it's just a bug in my code. Either way, make sure you clone this repository with the --recurse-submodules option. That pins the llama.cpp submodule to the older commit at https://github.com/ggerganov/llama.cpp/tree/20d7740a9b45f6e5b247fa3738fdda35e18c2e8a , which is the particular version of llama.cpp this repo was tested against.
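For reference, a minimal sketch of the clone/update commands. The repository URL is inferred from the project name, so adjust it to wherever you actually cloned from:

```
# Clone with the pinned llama.cpp submodule checked out in one step
git clone --recurse-submodules https://github.com/X-rayLaser/DistributedLLM.git
cd DistributedLLM

# Or, if the repo was already cloned without submodules, fetch them afterwards
git submodule update --init --recursive

# Verify that the llama.cpp submodule points at the expected commit (20d7740...)
git submodule status
```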

Now, regarding maintenance: yes, you are right. The project is no longer maintained, and there will be no further development or bug fixing.

Other projects to consider:

  1. llama.cpp at https://github.com/ggerganov/llama.cpp. A great open source project. They now have MPI support (see the sketch after this list). Quoting from the readme file: "MPI lets you distribute the computation over a cluster of machines. Because of the serial nature of LLM prediction, this won't yield any end-to-end speed-ups, but it will let you run larger models than would otherwise fit into RAM on a single machine."
  2. Petals at https://github.com/bigscience-workshop/petals. Another great and interesting project. I haven't used it personally, but it's worth trying. Basically, it's distributed inference running on a network of machines provided by volunteers. It should work out of the box after installing it. There is also a guide on how to deploy your own swarm and run your inference on it.
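For option 1, here is a rough sketch of the MPI workflow as the llama.cpp readme described it around that time. The build flags, hostfile, binary name, and model path are taken from or assumed after that readme and may differ in current versions:

```
# Build llama.cpp with MPI support (compilers and flag as documented in its readme)
make CC=mpicc CXX=mpicxx LLAMA_MPI=1

# 'hostfile' lists the participating machines, one per line (example file name).
# Layers are split across the hosts, so a model larger than any single machine's
# RAM can be loaded, but generation itself is not faster.
mpirun -hostfile ./hostfile -n 3 ./main -m ./models/7B/ggml-model-q4_0.bin -p "Hello" -n 128
```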