Closed · Ph0rk0z closed this 1 year ago
That would be great for users with lower PC specs. Apache TVM seems promising.
Btw, I saw a similar request at Kobold: https://github.com/KoboldAI/KoboldAI-Client/issues/324
If they use TVM without modifications, maybe TVM itself is the real backend here.
These benchmarks look very promising: https://github.com/mlc-ai/mlc-llm/issues/15
Seems like it allows splitting a model across Nvidia/AMD/Intel cards simultaneously when using Vulkan. Niche use case, but quite unique.
Finally being able to mix cards without having to use llama.cpp is definitely something, as is not having to patch ROCm for older AMD cards, etc.
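For what it's worth, since MLC-LLM compiles models through Apache TVM, each card should surface as a separate Vulkan device regardless of vendor. Here is a minimal sketch for checking which Vulkan devices TVM can see (it assumes a TVM build with Vulkan enabled; this is not MLC's own API, just the underlying runtime):

```python
# Minimal sketch: enumerate the Vulkan devices visible to TVM,
# the compiler stack MLC-LLM is built on. Assumes a TVM build
# compiled with Vulkan support.
import tvm

i = 0
while True:
    dev = tvm.vulkan(i)  # Vulkan device handle by index
    if not dev.exist:
        break
    print(f"vulkan({i}): {dev.device_name}")
    i += 1
```

With mixed vendors installed, the loop should print each card on its own index, which is what makes cross-vendor model splitting plausible in the first place.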
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
Vulkan support would be really great; gpt4all has done it, at least.
MLC is claiming almost 40 t/s on a 70B split across 4090s. I should try it at some point, since that is double what I get, and I don't think the 3090/4090 difference alone explains it.
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
@toxuin @KhazAkar @Ph0rk0z @Alek32x @LowYieldFire
Requesting a reopen. MLC-LLM is very easy to use for those with an AMD GPU; all it needs is an excellent frontend like text-generation-webui!
Would support a reopen too! MLC-LLM is a great project and being able to use it with webui would be a huge quality of life improvement for AMD GPU users!
It would be amazing to have this.
Please reopen this issue, since it's definitely not completed.
I would also like Vulkan support to be added and this request to be reopened. I am not entirely sure, but this project uses llama.cpp as one loader. Wouldn't it be easy to implement a llama.cpp-specific argument (and similar arguments for other loaders with Vulkan support) that enables its Vulkan support? See the sketch below.
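One caveat: Vulkan in llama.cpp is a compile-time option rather than a runtime flag, so enabling it from the webui side mostly means installing a Vulkan-enabled llama-cpp-python build; the loader code itself stays the same. A minimal sketch (the model path and prompt are placeholders, and the CMake flag name is assumed from recent llama.cpp, where the option is `GGML_VULKAN`):

```python
# Minimal sketch: load a GGUF model through llama-cpp-python.
# Vulkan is chosen at install/build time rather than via a runtime argument, e.g.:
#   CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
# (flag name assumed from recent llama.cpp; older builds used LLAMA_VULKAN)
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-7b.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload every layer to whatever GPU backend was compiled in
)
out = llm("Q: What does Vulkan buy us here? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

So a hypothetical loader argument would really be selecting which llama-cpp-python wheel gets installed, not flipping a switch at load time.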
GPUs without proper support would also greatly benefit: single-board computers with powerful GPUs, older cards, and those on special drivers like nouveau.
LM Studio has this now; how come it's still not implemented here? It's in llama.cpp!
Description
https://github.com/mlc-ai/mlc-llm
https://mlc.ai/mlc-llm/docs/index.html
MLC supports a lot of other hardware and has decent AMD support. Reportedly their speeds are fast, and they ship their own quantization formats.
It can drive AMD/Nvidia through Vulkan or through the native backends (CUDA/ROCm), and supports Intel GPUs as well. People claim good speeds.
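For a sense of how small the integration surface could be, here is a hedged sketch using MLC's Python ChatModule API as documented around this time; the model name is a placeholder, and the device string assumes a Vulkan-capable build:

```python
# Hedged sketch of MLC-LLM's ChatModule API (mlc_chat package).
# The model name is a placeholder and assumes weights pre-compiled by MLC.
from mlc_chat import ChatModule

cm = ChatModule(
    model="Llama-2-7b-chat-hf-q4f16_1",  # placeholder MLC-compiled model id
    device="vulkan",  # also accepts "cuda", "rocm", "metal", "auto"
)
print(cm.generate(prompt="Explain what MLC-LLM is in one sentence."))
print(cm.stats())  # prefill/decode tokens-per-second reported by the runtime
```

If that holds up, a webui loader for MLC would mostly be plumbing generate/stats calls into the existing generation interface.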