kalavai-net / kalavai-client

A platform to crowdsource AI computation
https://kalavai-net.github.io/kalavai-client/
Apache License 2.0

Pascal GPUs require patches for vLLM and Triton #20

Open the-crypt-keeper opened 1 month ago

the-crypt-keeper commented 1 month ago

Triton no longer officially supports SM60 or SM61 GPUs. This includes the datacenter P40, P100 and P102 cards, the Quadro P5000, and the GTX 1080 family.

https://github.com/triton-lang/triton/issues/2780

Additionally, vLLM requires some patches to play nice with the P40's SM61 architecture, while aphrodite-engine seems to work OK out of the box.

Patches for Triton and vLLM are available as wheels here: https://github.com/sasha0552/pascal-pkgs-ci
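
For anyone unsure whether their card is affected, here is a minimal sketch of a check, assuming PyTorch is installed so the compute capability can be read with `torch.cuda.get_device_capability`. The helper names `needs_pascal_patches` and `detect_capabilities` are hypothetical and not part of Kalavai, vLLM or Triton; the set of affected capabilities only covers the SM60 and SM61 targets mentioned above.

```python
# Compute capabilities named in this issue as no longer officially
# supported by Triton (assumption: only SM60 and SM61 are considered).
PASCAL_CAPABILITIES = {(6, 0), (6, 1)}


def needs_pascal_patches(capability):
    """Return True if a (major, minor) compute capability is SM60 or SM61."""
    return tuple(capability) in PASCAL_CAPABILITIES


def detect_capabilities():
    """Return the (major, minor) capability of each visible CUDA device.

    Returns an empty list if PyTorch is not installed or no CUDA device
    is visible, so the script can run on any machine.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_capability(index)
        for index in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    for index, capability in enumerate(detect_capabilities()):
        major, minor = capability
        status = "needs patched wheels" if needs_pascal_patches(capability) else "ok"
        print(f"GPU {index}: SM{major}{minor} ({status})")
```

Running the script on a node prints one line per visible GPU; any device flagged as needing patched wheels would be a candidate for the wheels linked above.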

musoles commented 1 month ago

@the-crypt-keeper thanks for opening an issue! I'll take a look at the limitations.

Since aphrodite seems to work fine, could you test it within Kalavai? Here's a guide to do so (note that it does not have to be a GGUF model; you can plug in whatever model you like)