-
The `get_model_cls_by_arch_name` introduced in [Dynamic model class loading PR](https://github.com/sgl-project/sglang/pull/101) removes the hard-coded mapping between `MistralForCausalLM` and `LlamaFo…
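For readers unfamiliar with the pattern, here is a minimal sketch of how looking up a model class by architecture name can coexist with an explicit alias table, so that one architecture name reuses another's implementation. Every name here (`register_model`, `register_alias`, `resolve_model_cls`, and the two placeholder architectures) is hypothetical and chosen for illustration; this is not sglang's actual code.

```python
from typing import Dict

# Hypothetical registry for illustration only; not sglang's implementation.
_MODEL_REGISTRY: Dict[str, type] = {}
_ARCH_ALIASES: Dict[str, str] = {}


def register_model(cls: type) -> type:
    """Class decorator: register a model class under its own class name."""
    _MODEL_REGISTRY[cls.__name__] = cls
    return cls


def register_alias(alias: str, target: str) -> None:
    """Declare that architecture `alias` reuses the implementation of `target`."""
    _ARCH_ALIASES[alias] = target


def resolve_model_cls(arch_name: str) -> type:
    """Resolve an architecture name to a class, following one alias hop."""
    name = _ARCH_ALIASES.get(arch_name, arch_name)
    try:
        return _MODEL_REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unsupported architecture: {arch_name!r}") from None


@register_model
class BaseDecoderModel:
    """Placeholder model class (hypothetical)."""


# A second, compatible architecture name that shares the same implementation.
register_alias("CompatibleDecoderModel", "BaseDecoderModel")
```

With a table like this, dropping an alias entry makes the lookup fail loudly with a `ValueError` instead of silently picking the wrong class.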
-
Could this work with an OpenAI-compatible vendor API like Together or Fireworks? I would like to use Mixtral but would rather not host it. I tried, but the OpenAI implementation is hardcoded for OpenAI …
-
Hello, team!
Thanks for the excellent work.
When running batch inference, I sometimes encounter a server-side error that completely interrupts the process:
```
new fill batch. #seq: 7. #cached_to…
-
Hi! I just wanted to ask whether it is possible to use the OpenAI-compatible API in sglang with local models and to force a certain JSON schema. IIRC the original OpenAI API only supports `{ "type": "json_object" …
-
Update:
* Please see #6801 for major items in performance sprint.
* Please see #8779 for major items in a new architecture aimed at simplicity and performance.
* We are in the feedback gathering pha…
-
I ran into the following error when running my code:
Code
```
import sglang as sgl
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
@sgl.function
def test_function(s,…
-
I often need an LLM to generate scalar values.
I look at the logprobs of `True` and `False` and take the difference (True minus False).
Works very well for me:
```
f'{output_text}\n\nRate whether the text is well written.',…
-
I'm trying to use this on Databricks inside a notebook running on top of an 8xA10 single-node cluster. I'm initialising it like this:
```
from sglang import function, system, user, assistant, gen, …
-
### System Info
absl-py 2.1.0
accelerate 0.33.0
aiofiles 23.2.1
aiohappyeyeballs 2.3.4
aiohttp …
-
Thanks so much for the work on this repo so far.
I think prefix caching could be very useful and I see that vLLM is also starting to support it for some architectures.
It looks like the [BaseBac…
pj-ml updated
9 months ago