-
### Describe the bug
> RuntimeError: The size of tensor a (154) must match the size of tensor b (2304) at non-singleton dimension 1
### Reproduction
```python
# StableDiffusion3Pipeline
pipe.enab…
```
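For context, a minimal SD3 invocation looks roughly like the sketch below; the truncated `pipe.enab…` call from the report is unknown, and the model id and prompt here are only illustrative.

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Illustrative setup only; the truncated `pipe.enab…` call from the report is
# unknown, so no memory-saving helper is shown here.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",  # illustrative model id
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=28,
).images[0]
```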
-
I reviewed the code of modeling_qwen.py and noticed that, within the lookahead process, the draft_ids matched from the TrieTree are such that the attention_mask and position ids associated with the…
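For illustration only (not the repo's code), here is a toy sketch of how position ids and an attention mask for trie-matched draft tokens appended after the prompt could be laid out; the `draft_parents` layout and the depth-based position scheme are assumptions, not taken from modeling_qwen.py.

```python
import torch

def draft_positions_and_mask(prompt_len: int, draft_parents: list):
    """Toy layout for draft tokens appended after the prompt.

    draft_parents[i] is the index of draft i's parent within the draft list,
    or -1 if its parent is the last prompt token; parents are assumed to be
    listed before their children. (Illustrative assumptions only.)
    """
    n = len(draft_parents)

    # Position id of each draft token: prompt length plus its depth in the trie.
    depth = [0] * n
    for i, p in enumerate(draft_parents):
        depth[i] = 0 if p < 0 else depth[p] + 1
    position_ids = torch.tensor([prompt_len + d for d in depth])

    # Each draft token may attend to the full prompt, its trie ancestors, and itself.
    mask = torch.zeros(n, prompt_len + n, dtype=torch.bool)
    mask[:, :prompt_len] = True
    for i, p in enumerate(draft_parents):
        mask[i, prompt_len + i] = True
        j = p
        while j >= 0:
            mask[i, prompt_len + j] = True
            j = draft_parents[j]
    return position_ids, mask
```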
-
## 🐞Describing the bug
Hello. I'm trying to convert a PyTorch model to a stateful CoreML model.
I wrote this code referring to the [WWDC 2024 session Mistral-7B model](https://github.com/huggingface/swift-t…
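For reference, a minimal stateful conversion sketch with coremltools 8, assuming the usual pattern from the coremltools docs (a registered buffer mutated in place and exposed via `ct.StateType`); the toy module and shapes below are made up, not the Mistral-7B code from the session.

```python
import torch
import coremltools as ct

class ToyCache(torch.nn.Module):
    """Toy module whose registered buffer becomes Core ML state (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.register_buffer("k_cache", torch.zeros(1, 8, 64))

    def forward(self, x):
        # Mutate the buffer in place so the converter can treat it as state.
        self.k_cache.add_(x)
        return self.k_cache * 2.0

traced = torch.jit.trace(ToyCache().eval(), torch.zeros(1, 8, 64))

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(1, 8, 64), name="x")],
    outputs=[ct.TensorType(name="y")],
    # The registered buffer is exposed as model state by name.
    states=[ct.StateType(wrapped_type=ct.TensorType(shape=(1, 8, 64)), name="k_cache")],
    minimum_deployment_target=ct.target.iOS18,
)
```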
-
Hello, could you share the code related to the missing "attention fusion" for reference?
-
I used [sd-perturbed-attention](https://github.com/pamparamm/sd-perturbed-attention) and had clear, mostly positive results with it. Integrated Perturbed-Attention Guidance was presented with limited …
-
Does fine-tuning support V100 servers with 8×32 GB? V100 doesn't natively support FlashAttention, right?
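FlashAttention indeed does not support V100 (compute capability 7.0, pre-Ampere). Assuming the fine-tuning stack sits on Hugging Face transformers, a common fallback is to request the standard attention implementation, roughly like this (the model id below is just a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM

# On V100 FlashAttention-2 is unavailable, so request PyTorch's
# scaled-dot-product attention (or "eager") instead of "flash_attention_2".
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",      # placeholder model id
    torch_dtype=torch.float16,
    attn_implementation="sdpa",
)
```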
-
### System Info
L4 GPU (AWS G6.12xl) with TensorRT-LLM 0.11.0, running with Triton backends
### Who can help?
_No response_
### Information
- [ ] The official example scripts
- [ ] My own modified …
-
### Describe the bug
When the IP adapter is loaded, the IPAdapterAttnProcessor2_0 class is not set with training=False. I didn't see a huge difference in memory if it's manually set to False on those process…
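A rough sketch of setting those processors to eval manually, assuming a standard diffusers IP-Adapter setup (the checkpoint and adapter names below are only examples):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # example checkpoint
).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")

# IPAdapterAttnProcessor2_0 is an nn.Module (it owns to_k_ip/to_v_ip), so put
# every module-type attention processor into eval mode, i.e. training=False.
for proc in pipe.unet.attn_processors.values():
    if isinstance(proc, torch.nn.Module):
        proc.eval()
```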
-
### Bug description
It seems that they updated the Gemma v1 2B weights. Something to look into:
```
⚡ main ~/litgpt litgpt chat checkpoints/google/gemma-2b
{'access_token': None,
'checkpoint_…
```