SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Source for v2 (mobile inference engine) #194

Open peeteeman opened 3 months ago

peeteeman commented 3 months ago

Hello there!

I came across the v2 paper yesterday and saw the updates in the project README.

I am interested in porting the v2 framework to iOS. The goal is to complete a naive port first, and then add Metal shaders.

Are there any plans to release the source and instructions for running v2 on Android?

0wwafa commented 3 months ago

Please release PowerInfer-2 so that it can be tested on low-resource PCs (like llama.cpp) for comparison.

YixinSong-e commented 3 months ago

PowerInfer-2 will be open-sourced in the future. We're refining it to untangle it from our testing platform and make it accessible on PCs for the community.

sqzhang-jeremy commented 3 months ago

Can't wait to test your amazing work!

0wwafa commented 3 months ago

> Can't wait to test your amazing work!

Same here! I'd like to test it on a low-resource PC with no GPU, or with an old, small one.

UUSR commented 3 months ago

This is fantastic! The Meta-Llama-3-8B-Instruct-Q4_K_M.gguf model ran on my old smartphone with 6 GB of memory. I hope for v2 in the near future.

Stephen888888 commented 3 months ago

When can I use it on an Android phone?

ethanc8 commented 3 months ago

> PowerInfer-2 will be open-sourced in the future. We're refining it to untangle it from our testing platform and make it accessible on PCs for the community.

Is it possible you could release the testing platform and the code entangled with it, so that the reported results can be reproduced?