-
- [ ] [language-model.md - vscode-docs [GitHub] - Visual Studio Code - GitHub](https://vscode.dev/github/microsoft/vscode-docs/blob/main/api/extension-guides/language-model.md)
# language-model.md -…
-
In PyTorch distributed training, I get:
```
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/returnn/returnn/torch/engine.py", line 198, in Engine.init_train_from_config
…
-
When I use an A750 to run BigDL and load the Qwen-7B int4 model, it reports that memory is exceeded. I don't know what's going on; is there a problem with my setup?
The following is the error m…
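For context, a minimal sketch of how Qwen-7B is typically loaded in int4 with bigdl-llm on an Intel Arc GPU (XPU), assuming the `bigdl.llm` transformers-style API; the checkpoint path and generation settings are placeholders, and whether the model fits in the A750's 8 GB also depends on context length and KV cache:

```python
# Hypothetical sketch: 4-bit loading of Qwen-7B with bigdl-llm on an Intel Arc (XPU) device.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers the "xpu" device)
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen-7B-Chat"  # placeholder checkpoint path

# load_in_4bit=True converts the linear weights to int4 at load time,
# which is what keeps a 7B model within the A750's memory budget.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_4bit=True,
    trust_remote_code=True,
)
model = model.to("xpu")

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("Hello", return_tensors="pt").to("xpu")
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```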
-
This is more of a request, but would you be able to support using custom embeddings and negative embeddings as pipeline arguments? The reason I want to do this is so I can use prompt engineering techn…
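A minimal sketch of what such an interface could look like, modeled on the `prompt_embeds` / `negative_prompt_embeds` call arguments that diffusers' `StableDiffusionPipeline` already accepts; the model ID and prompts below are placeholders:

```python
# Sketch: passing precomputed text embeddings instead of plain prompt strings.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def encode(text):
    # Build embeddings yourself (this is where prompt-engineering tricks such as
    # token weighting or embedding blending would go).
    tokens = pipe.tokenizer(
        text,
        padding="max_length",
        max_length=pipe.tokenizer.model_max_length,
        truncation=True,
        return_tensors="pt",
    )
    return pipe.text_encoder(tokens.input_ids.to("cuda"))[0]

prompt_embeds = encode("a photo of an astronaut riding a horse")
negative_prompt_embeds = encode("blurry, low quality")

# Pass the embeddings directly as pipeline arguments.
image = pipe(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
).images[0]
```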
-
### Checklist
- [X] The issue exists after disabling all extensions
- [X] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused b…
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A…
-
As the title indicates, I'd be interested in understanding whether this is just for text generation or whether it could also be used to expose the embedding function.
-
### Feature request
It would be nice to constrain the model output with a CFG directly when calling `model.generate`.
This is already done by llama.cpp [grammars](https://github.com/ggerganov/ll…
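A simplified sketch of the underlying technique as it can be done in transformers today: a custom `LogitsProcessor` that masks every token not allowed by the current grammar state. A real CFG constraint (as in llama.cpp grammars) would advance a parser state each step; here a fixed whitelist of digit tokens stands in for it:

```python
# Sketch: constraining generate() by masking disallowed tokens in a LogitsProcessor.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessor,
    LogitsProcessorList,
)

class AllowedTokensProcessor(LogitsProcessor):
    def __init__(self, allowed_token_ids):
        self.allowed = torch.tensor(sorted(allowed_token_ids))

    def __call__(self, input_ids, scores):
        mask = torch.full_like(scores, float("-inf"))
        mask[:, self.allowed] = 0.0
        return scores + mask  # disallowed tokens get -inf logits

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Toy "grammar": only tokens that decode to digits are allowed in the continuation.
allowed = [i for i in range(len(tokenizer)) if tokenizer.decode([i]).strip().isdigit()]

inputs = tokenizer("The answer is", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=8,
    logits_processor=LogitsProcessorList([AllowedTokensProcessor(allowed)]),
)
print(tokenizer.decode(out[0]))
```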
-
Hi.
I use Llama 3, and I'd like to stream the output.
I mean, it should work somehow through the port-8001 API. I'd like to generate a few tokens at a time and send them to the client as they arrive. Is it possible?
It could …
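A minimal sketch of the generation side of streaming with transformers' `TextIteratorStreamer`; how the chunks actually reach the client over the port-8001 API depends on that server (e.g. SSE or chunked HTTP), and the model ID below is a placeholder:

```python
# Sketch: consuming tokens incrementally while generate() runs in a background thread.
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
inputs = tokenizer("Tell me a short story.", return_tensors="pt").to(model.device)

# generate() blocks until completion, so run it in a thread and read the streamer.
thread = Thread(
    target=model.generate,
    kwargs=dict(**inputs, max_new_tokens=128, streamer=streamer),
)
thread.start()
for chunk in streamer:
    print(chunk, end="", flush=True)  # forward each chunk to the client here
thread.join()
```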