-
### Expected Behavior
flux-fp8-dev should generate normal (non-black) images; I need help troubleshooting the black-image issue.
### Actual Behavior
Running flux-fp8-dev e4m3fn on a 4090, the text-to-image outputs are always black, and the problem still appears after disabling…
-
I found this issue when working with the lmms-lab/llava-onevision-qwen2-7b-ov model and qwen2vl (the transformers library is at the latest version).
### Code
```python
import json
import argparse…
-
I can do the following to search for papers: `curl 'https://huggingface.co/api/papers/search?q=attention'`
And I get this:
>[{"id":"2409.07146","title":"Gated Slot Attention for Efficient Linear…
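For scripting, the same search works from Python. A minimal sketch, assuming only the `requests` library; the endpoint and query string are the ones from the curl call above, and the `id`/`title` fields come from the response excerpt:

```python
import requests

# Same endpoint and query as the curl call above.
resp = requests.get(
    "https://huggingface.co/api/papers/search",
    params={"q": "attention"},
    timeout=30,
)
resp.raise_for_status()

# The endpoint returns a JSON list of paper records; "id" and
# "title" are visible in the response excerpt above.
for paper in resp.json():
    print(paper["id"], paper["title"])
```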
-
Here is the development roadmap for 2024 Q4. Contributions and feedback are welcome ([**Join Bi-weekly Development Meeting**](https://t.co/4BFjCLnVHq)). The previous 2024 Q3 roadmap can be found in #634.
…
-
I’m giving up. The files are writable and readable, but the error still appears. Nothing seems to fix it.
---------------------------------------------------------------------------
PermissionEr…
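Since the files look readable and writable, it may help to check what the OS itself reports for the failing path. A small diagnostic sketch (not a fix), POSIX-only; the path is a hypothetical placeholder, not the real one from the truncated traceback:

```python
import os
import stat

# Hypothetical placeholder for the path in the traceback above.
PATH = "/path/that/raises/PermissionError"

st = os.stat(PATH)
print("mode:    ", stat.filemode(st.st_mode))
print("owner uid:", st.st_uid, " my uid:", os.getuid())  # os.getuid() is POSIX-only
print("read OK: ", os.access(PATH, os.R_OK))
print("write OK:", os.access(PATH, os.W_OK))
```

If `os.access` agrees the file is writable, the error may instead come from the parent directory (creating or renaming a file needs write+execute on the directory), from the mode the file is opened with, or from another process holding the file.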
-
I am trying to use the llama3-llava-next-8b model, and I set `--model-path` to the local path of the llama3-llava-next-8b checkpoint I downloaded.
When I run python -m llava.serve.model_worker --host 0.0…
-
Hi, I'm new to NLP, and I am currently trying to finetune jina for text similarity comparison.
I constructed a dataset with columns `sentence1`, `sentence2`, and `score`, and I can easily train the mod…
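For reference, a minimal sketch of pair-with-score finetuning using the classic sentence-transformers training loop; the model id, example pairs, and hyperparameters are illustrative assumptions, not taken from the original post:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Model id is an assumption; jina v2 models need trust_remote_code
# (supported in recent sentence-transformers releases).
model = SentenceTransformer("jinaai/jina-embeddings-v2-base-en", trust_remote_code=True)

# Pairs with a similarity score, mirroring the sentence1/sentence2/score columns.
train_examples = [
    InputExample(texts=["A man is eating food.", "A man is eating a meal."], label=0.9),
    InputExample(texts=["A man is eating food.", "The sky is blue."], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# CosineSimilarityLoss pushes cos(embed(s1), embed(s2)) toward the label.
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```

If the scores are not already in [0, 1] (or [-1, 1]), normalize them first, since this loss compares them directly against a cosine similarity.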
-
**Describe the bug**
Query_input's shape is [batch, pos, n_heads, d_model], and the purpose of the code where the error occurred is to reshape query_input to [batch, pos, n_heads, d_head].
I found t…
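For context, mapping [batch, pos, n_heads, d_model] to [batch, pos, n_heads, d_head] is a per-head linear projection rather than a pure reshape. A shape-only sketch, assuming a per-head weight of shape [n_heads, d_model, d_head]; the sizes and einsum string are illustrative, not the library's exact code:

```python
import torch

# Illustrative sizes only.
batch, pos, n_heads, d_model, d_head = 2, 5, 8, 64, 16

query_input = torch.randn(batch, pos, n_heads, d_model)
W_Q = torch.randn(n_heads, d_model, d_head)  # assumed per-head projection weight

# [batch, pos, n_heads, d_model] x [n_heads, d_model, d_head]
#   -> [batch, pos, n_heads, d_head]
q = torch.einsum("bpnm,nmh->bpnh", query_input, W_Q)
print(q.shape)  # torch.Size([2, 5, 8, 16])
```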
-
Hello!
The `main` (`a441a3f`) branch of the AQLM repository does not support `flash attention 2`. The error occurs because `QuantizedWeight` does not have a `weight` attribute ([closed issue #31](https…
-
When will support for batch size > 1 be available, or where should I make modifications to enable this feature?