-
Are you interested in learning more about GPU programming and developing cool optimizations? Do you want to help build next-generation, state-of-the-art machine learning models and layers? Do you w…
-
I have gone through the example: opensearch-py-ml/examples/demo_deploy_cliptextmodel.html
The model is registered correctly in the OpenSearch cluster, but the final command of the example:
ml_client.depl…
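For reference, here is a minimal sketch of the register-then-deploy flow I am following; the host, file paths, and exact method signatures are from my own setup and my reading of the opensearch-py-ml `MLCommonClient` API, so they may differ from the example:

```python
# Minimal sketch (paths/host are placeholders; method names reflect my
# understanding of the opensearch-py-ml API and may differ by version):
from opensearchpy import OpenSearch
from opensearch_py_ml.ml_commons import MLCommonClient

os_client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])  # placeholder endpoint
ml_client = MLCommonClient(os_client)

# Registration succeeds for me and returns a model_id.
model_id = ml_client.register_model(
    model_path="clip-text-model.zip",           # placeholder path to the traced model
    model_config_path="clip-text-config.json",  # placeholder path to the model config
    isVerbose=True,
)

# This final deploy step is where the example fails for me.
response = ml_client.deploy_model(model_id)
print(response)
```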
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing answer for this question in the FAQ?
-
### System Info
the official Docker environment from docker/Dockerfile.multi at commit "c629546"
### Who can help?
@byshiue @ncomly-nvidia
I am trying to convert deepseek-v2-lite:
```
python convert_checkpoint.py …
-
**Describe the bug**
I tried to use the LLaVA example and ran into a key mismatch error. I am on the latest commit of the main branch (094d66b).
[rank0]: RuntimeError: Error(s) in loading state_dict for LLaVAMode…
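As a side note, the way I usually narrow down this kind of state_dict error is to load non-strictly and print the mismatched keys; a self-contained toy sketch (with hypothetical module/key names, not the actual LLaVA ones) looks like this:

```python
import torch
import torch.nn as nn

# Toy model standing in for the real one; the checkpoint has one extra key
# to show what the diagnostic output looks like.
model = nn.Sequential(nn.Linear(4, 4))
ckpt = {
    "0.weight": torch.zeros(4, 4),
    "0.bias": torch.zeros(4),
    "vision_tower.weight": torch.zeros(1),  # hypothetical unexpected key
}

result = model.load_state_dict(ckpt, strict=False)
print("missing keys:   ", result.missing_keys)
print("unexpected keys:", result.unexpected_keys)
```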
-
Opening this to add support for all models following #34282
Let's bring support for flex attention to more models! 🤗
- [x] Gemma2
It would be great to add support for more architectures s…
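For anyone picking up one of the architectures: the underlying primitive is PyTorch's `torch.nn.attention.flex_attention` (available in recent PyTorch releases, 2.5+ as far as I know). A minimal standalone sketch of the `score_mod`-based causal masking it enables:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

def causal_score_mod(score, b, h, q_idx, kv_idx):
    # Keep scores where the query position may attend to the key position,
    # otherwise push them to -inf (standard causal masking).
    return torch.where(q_idx >= kv_idx, score, -float("inf"))

# Toy sizes: (batch, heads, seq_len, head_dim)
q = torch.randn(1, 8, 128, 64)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

out = flex_attention(q, k, v, score_mod=causal_score_mod)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```

In practice this is usually wrapped in `torch.compile` and combined with a block mask for speed; on the transformers side it is then exposed through `attn_implementation="flex_attention"` in `from_pretrained`, as I understand it from #34282.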
-
### System Info
transformers==4.45.2
When preparing the cross_attention_mask in the `_prepare_cross_attention_mask` function, we get the `cross_attn_mask` with a shape of [batch, text_tokens, i…
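For context, here is a generic illustration (not the actual transformers code) of how a boolean cross-attention mask of shape [batch, text_tokens, image_tokens] is typically expanded into an additive mask broadcast over attention heads:

```python
import torch

batch, text_tokens, image_tokens = 2, 5, 7

# Hypothetical boolean mask: True where a text token may attend to an image token.
bool_mask = torch.ones(batch, text_tokens, image_tokens, dtype=torch.bool)
bool_mask[:, :, -2:] = False  # e.g. padded image tokens

# Insert a head dimension and convert to an additive mask:
# shape becomes [batch, 1, text_tokens, image_tokens].
additive_mask = torch.zeros(batch, 1, text_tokens, image_tokens)
additive_mask = additive_mask.masked_fill(
    ~bool_mask.unsqueeze(1), torch.finfo(torch.float32).min
)
print(additive_mask.shape)  # torch.Size([2, 1, 5, 7])
```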
-
![image](https://github.com/user-attachments/assets/3bc230bc-5029-4657-b107-0f1a1b54be15)
Error:
`Phi3Transformer does not support an attention implementation through torch.nn.functional.scaled_do…
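The workaround I would try first (the usual transformers pattern when a model class does not implement SDPA, not something specific to this model) is to force the eager attention path when loading; the checkpoint name below is a placeholder for whichever Phi-3 variant triggers the error:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3.5-vision-instruct",  # placeholder checkpoint
    attn_implementation="eager",          # skip the scaled_dot_product_attention code path
    trust_remote_code=True,
)
```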
-
## 🐛 Bug
I get the following assertion error from Thunder JIT:
```py
File ~/dev/lightning-thunder/thunder/core/jit_ext.py:1731, in thunder_general_jit(fn, args, kwargs, record_history, sharp_edges,…
-
```
output = self.model(sequences, attention_mask=attention_mask, position_ids=position_ids)
  File "/root/miniconda3/envs/OpenRLHF/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553…
```