-
Greetings, @cipher982!
I've seen the benchmark application https://www.llm-benchmarks.com/local and it looks great! I'm currently working on a competitive analysis of these 4 backends: Transformers…
-
In line with the main philosophy of the Symbiont app, we want to use products that are open source and provide the option for self-hosting for maximum privacy and control.
-
When I use the multimodal example, I download the original model liuhaotian/llava-v1.5-7b, but this error occurs:
llama = from_hugging_face(
File "/usr/local/lib/python3.10/dist-packages/tensor…
-
ValueError: LoRA rank 64 is greater than max_lora_rank 16.
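This error is raised by vLLM's LoRA configuration check: an adapter's rank may not exceed the engine's configured `max_lora_rank` (default 16). A minimal sketch of that validation follows; `validate_lora_rank` is a hypothetical helper for illustration, not vLLM's actual code.

```python
# Sketch of the rank check behind the ValueError above.
# `max_lora_rank` mirrors vLLM's engine argument of the same name;
# this helper is illustrative only.
def validate_lora_rank(lora_rank: int, max_lora_rank: int = 16) -> None:
    if lora_rank > max_lora_rank:
        raise ValueError(
            f"LoRA rank {lora_rank} is greater than "
            f"max_lora_rank {max_lora_rank}."
        )

validate_lora_rank(16)  # a rank-16 adapter fits the default cap

try:
    validate_lora_rank(64)  # a rank-64 adapter does not
except ValueError as err:
    print(err)
```

The usual fix is to raise the cap when starting the engine (vLLM exposes a `--max-lora-rank` option) so it covers the largest adapter you intend to load.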
-
Is there any performance comparison data between ScaleLLM and vLLM?
-
Concise Description:
I'd like to use JAX for distributed training of LLMs. Moreover, the new Keras release supports JAX as a backend in addition to TF.
Describe the solution you'd like
I'd …
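For context on the Keras point above: Keras 3 selects its backend from the `KERAS_BACKEND` environment variable, read once at import time. A minimal sketch (the variable must be set before `keras` is first imported):

```python
import os

# Keras 3 reads KERAS_BACKEND when the package is imported, so this
# must run before the first `import keras`.
os.environ["KERAS_BACKEND"] = "jax"

# Importing keras at this point would run on JAX, e.g.:
#   import keras
#   keras.backend.backend()  # reports the active backend name
print(os.environ["KERAS_BACKEND"])
```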
-
**Description**
I ran a benchmark of Meta-Llama-3-8B-Instruct on 8×RTX 4090,
![image](https://github.com/triton-inference-server/server/assets/68674291/1a0fd341-8d8f-4893-973c-ed1ed3b74aca)
when r…
-
## Ask your question here:
Hello! I am working on an integration between Kserve/Knative with vLLM for deploying LLMs. vLLM is a production inference server for LLMs, and I have instrumented…
-
Objective: TriagerX is a novel AI-enabled software analytics tool that we developed via the IBM CAS project (with Dr. Uddin). TriagerX aims to assign an issue to components/teams and developers and to…
-
### Community Note
* Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the…