-
[Outline] I would like to add a section about optimizing the speed and response times of LLMs under General Concepts. I plan to cover the following topics:
- Quantization
- Flash attention
- Arch…
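To ground the quantization bullet, here is a minimal sketch of symmetric per-tensor int8 weight quantization. This is illustrative only; production schemes typically use per-channel scales, calibration data, or quantization-aware training.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization of a weight matrix."""
    scale = np.abs(w).max() / 127.0  # map the largest |weight| to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 weight matrix."""
    return q.astype(np.float32) * scale

# int8 weights use 4x less memory than float32, at the cost of a
# small, bounded rounding error (at most scale/2 per weight).
np.random.seed(0)
w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
```

The memory savings (and faster int8 matmul kernels, where available) are where the speedup comes from; the rounding error is what quantization research tries to keep from hurting model quality.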
-
Language tag handling
* https://github.com/WICG/translation-api/blob/main/README.md#for-a-known-source-language
* https://github.com/WICG/translation-api/blob/main/README.md#language-tag-handling
…
-
### 🚀 The feature, motivation and pitch
As we can see, Google Gemini can support up to a million tokens. To serve longer context lengths, we have to do context parallelism, which means splitting the i…
-
The official code does not track any validation-set metric during training, which makes it hard to monitor overfitting. After some experimentation, adding `compute_metrics` does not work either: the `Trainer`'s `evaluate` logic is somewhat convoluted and never reaches it, so in the end `evaluate` has to be refactored. Below is a very simple refactor for reference that returns the validation-set loss during training; you only need to add the usual `do_eval`, `eval_steps`, `evaluation_strategy`, and similar arguments. You can, depending on…
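Framework details aside, the `eval_steps` behavior being requested is simply: every N optimizer steps, run the model over the validation set and record the average loss. A framework-free sketch of that loop (the helper names `step_fn` and `eval_fn` are hypothetical stand-ins, not `Trainer` API):

```python
def train_with_eval(train_batches, eval_batches, step_fn, eval_fn, eval_steps):
    """Train, averaging the validation loss every `eval_steps` steps.

    step_fn(batch) performs one optimizer step; eval_fn(batch) returns
    the loss on one validation batch. Both are hypothetical stand-ins.
    """
    history = []  # (step, eval_loss) pairs, useful for spotting overfitting
    for step, batch in enumerate(train_batches, start=1):
        step_fn(batch)
        if step % eval_steps == 0:
            eval_loss = sum(eval_fn(b) for b in eval_batches) / len(eval_batches)
            history.append((step, eval_loss))
    return history
```

A rising `eval_loss` while the training loss keeps falling is the overfitting signal the original code could not surface.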
-
Hope you can help with this. I'm trying to implement ring attention using the Llama 3 architecture, starting with the blockwise parallel transformer piece. My question is: when do I start to break t…
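For the blockwise question above, the usual pattern (as in blockwise parallel transformers and FlashAttention-style kernels) is to split queries and keys/values into fixed-size blocks and accumulate the softmax online, so no block ever needs to see the full sequence at once. A single-head NumPy sketch, ignoring masking and multi-head details:

```python
import numpy as np

def full_attention(q, k, v):
    """Reference: standard softmax attention over the whole sequence."""
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    return (p / p.sum(axis=-1, keepdims=True)) @ v

def blockwise_attention(q, k, v, block=4):
    """Same result, computed one (query block, kv block) pair at a time."""
    n, d = q.shape
    out = np.zeros_like(q)
    for qs in range(0, n, block):
        qb = q[qs:qs + block]
        m = np.full((qb.shape[0], 1), -np.inf)  # running row max
        l = np.zeros((qb.shape[0], 1))          # running softmax denominator
        acc = np.zeros((qb.shape[0], d))        # running weighted-value sum
        for ks in range(0, n, block):
            s = qb @ k[ks:ks + block].T / np.sqrt(d)
            m_new = np.maximum(m, s.max(axis=-1, keepdims=True))
            p = np.exp(s - m_new)
            scale = np.exp(m - m_new)  # rescale old stats to the new max
            l = l * scale + p.sum(axis=-1, keepdims=True)
            acc = acc * scale + p @ v[ks:ks + block]
            m = m_new
        out[qs:qs + block] = acc / l
    return out
```

In ring attention, the inner loop over kv blocks becomes a loop over devices: each device keeps its own query block and rotates its kv block around the ring, updating `m`, `l`, and `acc` exactly as above.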
-
**What needs attention**
A special state when entering a lifter that allows for sideways momentum, with a dive animation. Would allow angled lifters to work properly.
-
Can we have [PAG](https://ku-cvlab.github.io/Perturbed-Attention-Guidance/) integrated into the BrushNet pipeline, since it seems to give extremely good results? It is already there in some [standard pip…
-
It seems the Attention Couple only has `model` and `base_mask` inputs, but in your shared sample workflow there are more inputs.
-
**Attention** is the new currency in our information-saturated world. It shapes our worldview, drives our decisions, and ultimately affects the quality of our lives. In an era of information overload,…
-
Hello,
I have a question regarding fine-tuning the quantized internlm/internlm-xcomposer2-4khd-7b model. I quantized the 4khd model with lmdeploy, and am not trying to fine-tune thi…