Closed by Dan-wanna-M 2 months ago
Interestingly, the huggingface integration runs 10x slower than the vllm and exllamav2 integrations. We need to find out why, since huggingface is still the default choice for many researchers and developers.
Fixed in v0.2.0.