-
**Description**
I ran a benchmark of Meta-Llama-3-8B-Instruct on 8× RTX 4090 GPUs,
![image](https://github.com/triton-inference-server/server/assets/68674291/1a0fd341-8d8f-4893-973c-ed1ed3b74aca)
when r…
-
Objective: TriagerX is a novel AI-enabled software analytics tool that we developed via the IBM CAS project (with Dr. Uddin). TriagerX aims to assign an issue to components/teams and developers and to…
llxia updated
1 month ago
-
### Before submitting your bug report
- [ ] I believe this is a bug. I'll try to join the [Continue Discord](https://discord.gg/NWtdYexhMs) for questions
- [ ] I'm not able to find an [open issue](ht…
-
### Discussed in https://github.com/kserve/kserve/discussions/3097
Originally posted by **Peilun-Li** August 25, 2023
There are some emerging serving runtimes dedicated to LLM hosting, e.g., v…
-
### When the type of context in the incoming messages is text, an error occurs.
**API**: `/v1/chat/completions`
### request
```json
{
"max_tokens": 0,
"model": "qwen-72b-chat-int4"…
-
### 🚀 The feature, motivation and pitch
In the MTEB leaderboard, the current best embedding model is `Alibaba-NLP/gte-Qwen2-7B-instruct`.
However, using the embedding endpoint on it returns the foll…
-
**Describe the bug**
The Windows absolute URL in downloadOllama.js returns a 404.
**To Reproduce**
+ First time installing - installed as admin
+ Open Reor and immediately get this error message:
```
Error: …
-
I ran into a series of issues trying to get vLLM stood up on a system with multiple MI210s. I figured I'd document my issues and workarounds so that someone could pick up the baton later, or at least …
-
- [ ] [LoRA Land: Fine-Tuned Open-Source LLMs that Outperform GPT-4 - Predibase - Predibase](https://predibase.com/blog/lora-land-fine-tuned-open-source-llms-that-outperform-gpt-4)
# LoRA Land: Fine…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
Hello team, I use the Ollama service as the LLM server, and I use **Llama 3**.
I used the…