-
### Describe the bug
I have launched asynchronous calls to a BentoServer deployed with a vLLM backend on K8s.
I have loaded a CodeLlama 13B in float16.
An error occurs during the call:
…
-
I don't know what I am doing wrong, but it is not responding. There are no errors in the terminal, but there are no responses in Discord, in either the app or the web client. I have Ollama installed and running. I am trying …
-
Hey all, I am getting some errors when using the OpenAI API endpoint:
```
INFO: ::1:51556 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request
```
Getting that somewhat often (every 15 mi…
-
The puzzle `13-97a91bb1` from #13 has to be resolved:
https://github.com/h1alexbel/fakehub/blob/f8bde4ef70b3c73a6cbd9a4570141989eebd5fe3/cli/src/args.rs#L24-L27
The puzzle was created by @h1alexbel…
0pdd updated 3 months ago
-
@h1alexbel for CLI tooling we can use [clap](https://crates.io/crates/clap), wdyt?
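For illustration, a minimal sketch of what the argument parsing could look like with clap's derive API. The flag names and defaults here are assumptions for the example, not the actual fakehub interface:

```rust
// Sketch only: assumes `clap = { version = "4", features = ["derive"] }` in Cargo.toml.
use clap::Parser;

/// Hypothetical CLI surface; the `--port` flag is illustrative, not the real interface.
#[derive(Parser, Debug)]
#[command(name = "fakehub", about = "GitHub API server stub")]
struct Args {
    /// Port to listen on (assumed default for the sketch).
    #[arg(long, default_value_t = 3000)]
    port: u16,
}

fn main() {
    // clap generates parsing, `--help`, and error messages from the struct above.
    let args = Args::parse();
    println!("listening on port {}", args.port);
}
```

One advantage of the derive style is that `--help` output and validation (e.g. rejecting a non-numeric port) come for free from the struct definition.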
-
## 🚀 Feature
Add activation checkpointing to [benchmark_litgpt script](https://github.com/Lightning-AI/lightning-thunder/blob/main/thunder/benchmarks/benchmark_litgpt.py).
### Motivation
Li…
-
Hi! I noticed that as soon as I kill Ollama (because one cannot unload models from VRAM manually) and start `ollama serve` on my own, all models delete themselves.
Is that a bug or a feature (perhaps en…
-
Hey @RahulSChand, awesome work on creating this calculator. However, I am running into some problems and getting unreliable results. Here are some of the issues:
The configurations I wil…
-
### Discussed in https://github.com/rjmacarthy/twinny/discussions/238
Originally posted by **2picus** May 5, 2024
I have Twinny installed on Ubuntu Linux 22.04, and have installed the codella…
-
Let's add a linter for Markdown, especially for `README.md`, in order to control its quality.
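As a starting point, a possible markdownlint configuration. The rule choices below are suggestions for discussion, not settled decisions:

```yaml
# .markdownlint.yml — sketch only; rule selection is a suggestion.
default: true        # enable all rules by default
MD013:
  line_length: 120   # relax the 80-character default
MD033: false         # allow inline HTML, e.g. for badges in README.md
```

It could then run locally or in CI with `markdownlint-cli2 "**/*.md"`.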