-
Why use a Linear Transformer rather than Mamba-2, RWKV, or another backbone with better performance and speed?
Do we have any relevant experiments?
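For context, a minimal sketch of the causal linear-attention recurrence that defines a Linear Transformer (feature map `elu(x) + 1` per Katharopoulos et al.; shapes and names are illustrative assumptions, not any repo's API):

```python
import torch

def causal_linear_attention(q, k, v):
    # q, k, v: (seq_len, dim); feature map from "Transformers are RNNs"
    phi = lambda x: torch.nn.functional.elu(x) + 1.0
    q, k = phi(q), phi(k)
    s = torch.zeros(k.shape[-1], v.shape[-1])  # running sum of outer(k_t, v_t)
    z = torch.zeros(k.shape[-1])               # running sum of k_t
    out = []
    for t in range(q.shape[0]):
        s = s + torch.outer(k[t], v[t])
        z = z + k[t]
        out.append((q[t] @ s) / (q[t] @ z + 1e-6))
    return torch.stack(out)
```

The fixed-size per-step state update is exactly what makes the comparison with Mamba-2 and RWKV, which are also recurrent-state models, a natural question.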
-
**Describe the bug**
Two unit test cases of the `ttnn.linear` op in the PETR model Transformer submodule fail with low PCCs of 0.4 and 0.6.
**To Reproduce**
Steps to reproduce the behavior:
1. Ch…
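For reference, the PCC figure quoted above can be reproduced as a plain Pearson correlation between the flattened torch reference and device outputs; this is a generic sketch, not the repository's actual comparison helper:

```python
import torch

def compute_pcc(ref: torch.Tensor, out: torch.Tensor) -> float:
    # Pearson correlation coefficient between two flattened tensors.
    a = ref.flatten().float()
    b = out.flatten().float()
    a, b = a - a.mean(), b - b.mean()
    return float((a @ b) / (a.norm() * b.norm() + 1e-12))
```

A PCC near 1.0 means the device output tracks the reference; values around 0.4–0.6 indicate a genuine numerical divergence rather than noise.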
-
### Feature Idea
# 🐱 Sana Model Card
## Model
We introduce **Sana**, a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution.
Sana can synth…
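A minimal usage sketch, assuming the `SanaPipeline` integration in diffusers; the checkpoint id below is a placeholder assumption, not taken from this excerpt:

```python
import torch
from diffusers import SanaPipeline  # requires a diffusers release with Sana support

# Repo id is an assumed placeholder; use the id from the official model card.
pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers", torch_dtype=torch.bfloat16
)
pipe.to("cuda")

image = pipe(prompt="a cyberpunk cat with a neon sign").images[0]
image.save("sana.png")
```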
-
Here is my issue:
/home/fountain/miniconda3/bin/conda run -n Linear_Alignment --no-capture-output python /home/fountain/pycharmProjects/Linear_Alignment/demo.py
Loading checkpoint shards: 100%|███…
-
Torch reference:
https://github.com/open-mmlab/mmdetection3d/tree/main/projects/PETR
**Torch graphs:**
Transformer module: [model_petr_transformer.gv.pdf](https://github.com/user-attachments/fil…
-
### 🚀 The feature, motivation and pitch
1. [Exphormer: Sparse Transformers for Graphs](https://arxiv.org/abs/2303.06147)
2. [SGFormer: Simplifying and Empowering Transformers for Large-Graph Represe…
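A minimal sketch of the edge-restricted attention these papers build on, the shared idea behind sparse graph transformers such as Exphormer; function and weight names are illustrative, not any library's API:

```python
import torch

def sparse_graph_attention(x, edge_index, wq, wk, wv):
    # Attention scores are computed only over the supplied edge set,
    # rather than over all node pairs.
    q, k, v = x @ wq, x @ wk, x @ wv
    src, dst = edge_index                        # each of shape (num_edges,)
    scores = (q[dst] * k[src]).sum(-1) / q.shape[-1] ** 0.5
    w = (scores - scores.max()).exp()            # stabilised exponentials
    denom = torch.zeros(x.shape[0]).index_add_(0, dst, w)
    num = torch.zeros_like(v).index_add_(0, dst, w.unsqueeze(-1) * v[src])
    return num / denom.clamp_min(1e-9).unsqueeze(-1)  # per-node softmax average
```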
-
Thank you for your excellent work!
I maintain a [library](https://github.com/hp-l33/flash-bidirectional-linear-attention/blob/main/fbi_la/layers/focused_la/attention.py) implementing bi-directional…
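For reference, a minimal sketch of the non-causal (bi-directional) linear attention such a kernel computes; the feature map and shapes are assumptions, not the linked library's exact interface:

```python
import torch

def bidirectional_linear_attention(q, k, v):
    # q, k: (n, d); v: (n, d_v). Non-causal, so the K^T V summary
    # is shared by every query position.
    phi = lambda x: torch.nn.functional.elu(x) + 1.0
    q, k = phi(q), phi(k)
    kv = k.transpose(-2, -1) @ v                 # (d, d_v) global summary
    z = k.sum(dim=-2)                            # (d,) normaliser
    return (q @ kv) / (q @ z).unsqueeze(-1).clamp_min(1e-6)
```

Because the summary is shared, the cost is O(n·d²) rather than O(n²·d), which is what makes a fused bi-directional kernel attractive.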
-
### 🐛 Describe the bug
After switching the export API from `capture_pre_autograd_graph` to `export_for_training`, we got a `ValueError: Linear partition cannot have more than one output node`.
W…
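For context, the API migration described above looks roughly like this; a minimal sketch assuming a plain `torch.nn.Module` and example inputs:

```python
import torch
from torch.export import export_for_training

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(2, 8),)

# Old path (deprecated):
#   from torch._export import capture_pre_autograd_graph
#   gm = capture_pre_autograd_graph(model, example_inputs)

# New path: export_for_training returns an ExportedProgram whose
# .module() yields the GraphModule consumed by downstream passes
# such as quantization.
exported = export_for_training(model, example_inputs)
gm = exported.module()
```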
-
### System Info
peft = 0.13.2
python = 3.12.7
transformers = 4.45.2
### Who can help?
@sayakpaul
I am using `inject_adapter_model(...)` to finetune a model from OpenCLIP using LoRA layers…
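For context, PEFT's documented helper for this workflow is `inject_adapter_in_model`; a minimal sketch with an assumed target-module name (`proj` and the wrapper class are hypothetical):

```python
import torch
from peft import LoraConfig, inject_adapter_in_model

class TinyCLIPBlock(torch.nn.Module):
    # Stand-in for an OpenCLIP submodule; "proj" is an assumed layer name.
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(512, 512)

    def forward(self, x):
        return self.proj(x)

config = LoraConfig(r=8, lora_alpha=16, target_modules=["proj"])
model = inject_adapter_in_model(config, TinyCLIPBlock())

# The targeted Linear is now wrapped with LoRA A/B matrices in place.
out = model(torch.randn(1, 512))
```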
-
File "/root/ld/ld_project/pull_request/MiniCPM-V/web_demo_2.6.py", line 44, in
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
File "/root/ld/conda/envs/minicpm/lib/py…