attention-model Search Results

1000+ results
for attention-model

vllm-project/vllm #9551

[Usage]: Custom LLM Generate

### Your current environment ```text The output of `python collect_env.py` ``` ### How would you like to use vllm I'm implementing a custom algorithm that requires a custom generate met…

Blaizzy updated 1 month ago
12
Anjok07/ultimatevocalremovergui #1632

Runtime Error - Ensemble Mode htdemucs_ft

Last Error Received: Process: Ensemble Mode If this error persists, please contact the developers with the error details. Raw Error Details: RuntimeError: "Invalid buffer size: 35.38 GB" …

Seanux77 updated 3 days ago
1
NVIDIA/Megatron-LM #1259

[BUG] Flash attention cannot be applied by passing the --use…

Passing the --use-flash-attn flag is intended to enable flash attention; however, when the --use-mcore-models flag (to use the transformer engine) is also specified, flash attention will not be applie…

efsotr updated 3 weeks ago
1
facebookresearch/sam2 #329

Seeking Guidance: Addressing Performance-Related Warning Mes…

Thank you for taking the time to review my question. Before I proceed, I would like to mention that I am a beginner, and I would appreciate your consideration of this fact. I am seeking assistan…

eanzero updated 6 days ago
7
axolotl-ai-cloud/axolotl #2068

Deepspeed zero3 + LoRA: RuntimeError: Only Tensors of floati…

### Please check that this issue hasn't been reported before. - [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) didn't find any similar reports. ### Exp…

bursteratom updated 1 week ago
1
BerriAI/litellm #6709

[Feature]: Triton embedding custom input params

### The Feature To support custom input params for Triton embedding server. ### Motivation, pitch Currently the input payload params of the Triton Embedding model call is fixed with below for…

suresiva updated 2 weeks ago
1
BAAI-DCAI/Bunny #138

Question about deepspeed checkpoint loading

I tried to load Lora training adapters from Deepspeed checkpoint: dir: ``` ls Bunny/checkpoints-llama3-8b/bunny-lora-llama3-8b-attempt2/checkpoint-6000 total 696M -rw-r--r-- 1 schwan46494@gmail.c…

Wintoplay updated 1 week ago
1
Lightning-AI/lightning-thunder #1407

HF Qwen 2 with Thunder returns a slightly different loss fun…

## 🐛 Bug We need to determine whether Thunder has real accuracy problems computing HF's Qwen 2 model. The test added in https://github.com/Lightning-AI/lightning-thunder/pull/1406 might fail bec…

IvanYashchuk updated 1 week ago
1
huggingface/tokenizers #1688

Llama-3.2 offset-mapping needs fixing

Very similar to the issues here ([#1553](https://github.com/huggingface/tokenizers/issues/1553), [#1517](https://github.com/huggingface/tokenizers/issues/1517)), but for the newest Llama models the of…

kyrawilson updated 1 day ago
1
unslothai/unsloth #1326

qwen2-vl 2b 4-bit always getting OOM, yet llama3.2 11b works…

qwen2-vl has always been memory hungry (compared to the other vision models) and even with unsloth it still OOMs when the largest llama3.2 11b works fine. I'm using a dataset that has high resolution…

mehamednews updated 1 day ago
3
