-
I noticed the following code in `MSCASpatialAttention`:
```python
def forward(self, x):
    """Forward function."""
    shorcut = x.clone()
    x = self.proj_1(x)
    …
```
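For reference, here is a minimal sketch of the residual pattern I believe this block implements; the layer names and types (`proj_1`, `activation`, `spatial_gating_unit`, `proj_2`) are my assumptions, not copied from the repository.

```python
import torch
import torch.nn as nn

class SpatialAttentionSketch(nn.Module):
    """Hypothetical sketch of the block around the quoted lines.

    The concrete layers are placeholders; only the shortcut-then-add
    structure is the point of this example.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.proj_1 = nn.Conv2d(channels, channels, kernel_size=1)
        self.activation = nn.GELU()
        self.spatial_gating_unit = nn.Identity()  # placeholder for the attention/gating module
        self.proj_2 = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        shortcut = x.clone()            # keep the input for the residual connection
        x = self.proj_1(x)
        x = self.activation(x)
        x = self.spatial_gating_unit(x)
        x = self.proj_2(x)
        return x + shortcut             # residual add back onto the saved input
```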
-
Config:
Windows 10 with an RTX 4090
All requirements installed, including the flash-attn build.
Server:
```
(venv) D:\PythonProjects\hertz-dev>python inference_server.py
Using device: cuda
Loaded tokeniz…
-
When I call `null_inversion.invert()`, the following error occurs:
```
Traceback (most recent call last): …
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is this question already answered in the FAQ? …
-
I'm not sure Issues is the best place to post this, but I just wanted to see if anyone else has been trying this idea:
There was [a paper that came out recently](https://arxiv.org/abs/2410.05258…
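For anyone curious, my understanding of the core idea in that paper (Differential Transformer) is that two separate softmax attention maps are computed and one is subtracted from the other to cancel attention noise. A minimal single-head sketch of that idea, where the function name and the fixed `lam` value are my own simplifications rather than the paper's code:

```python
import torch
import torch.nn.functional as F

def differential_attention(q1, k1, q2, k2, v, lam: float = 0.5):
    """Single-head sketch of differential attention.

    q1, k1, q2, k2: (batch, seq, d) tensors from two independent Q/K projections.
    v:              (batch, seq, d_v) values.
    lam:            a learnable scalar in the paper; a fixed float here for brevity.
    """
    d = q1.shape[-1]
    a1 = F.softmax(q1 @ k1.transpose(-1, -2) / d**0.5, dim=-1)  # first attention map
    a2 = F.softmax(q2 @ k2.transpose(-1, -2) / d**0.5, dim=-1)  # second attention map
    # The difference of the two maps weights the values, cancelling common-mode noise.
    return (a1 - lam * a2) @ v
```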
-
Hello,
I'm new to Transformers and Core ML. I converted the Llama-3.2-1B-Instruct model from
https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct to a Core ML model using
`python conver…
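In case it helps narrow down where things go wrong, here is a rough sketch of what I understand the conversion to be doing under the hood; the sequence length, dtype, and deployment target below are assumptions on my side, not values from the actual script, and the direct `torch.jit.trace` of the full model may need adjusting:

```python
import numpy as np
import torch
import coremltools as ct
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float32,
    return_dict=False,   # tuple outputs are easier to trace than ModelOutput objects
)
model.eval()

seq_len = 64  # assumed fixed sequence length for the traced graph
example_ids = torch.zeros((1, seq_len), dtype=torch.int64)

# Trace the forward pass so coremltools can consume a TorchScript graph.
traced = torch.jit.trace(model, example_ids)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="input_ids", shape=(1, seq_len), dtype=np.int32)],
    minimum_deployment_target=ct.target.iOS17,
)
mlmodel.save("Llama-3.2-1B-Instruct.mlpackage")
```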
-
### Motivation.
Take a look at the current Llama forward computation logic:
```python
class LlamaMLP(nn.Module):
def forward(self, x):
gate_up, _ = self.gate_up_proj(x)
x…
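        # For context (my own sketch, not the vLLM source): a SwiGLU-style MLP like this
        # usually proceeds roughly as
        #     gate_up = W_gate_up @ x       # one fused matmul producing [gate | up]
        #     hidden  = silu(gate) * up     # gated activation over the two halves
        #     out     = W_down @ hidden     # project back down to the hidden size
        # i.e. gate_up_proj fuses the gate and up projections into a single GEMM.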
-
```
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
…
```
-
### Your current environment
```text
The output of `python collect_env.py`
```
### How would you like to use vllm
I'm implementing a custom algorithm that requires a custom generate met…
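For context, a minimal sketch of the stock offline generation path in vLLM, which is the entry point I'd expect a custom generate method to wrap or replace; the model name and sampling values below are placeholders, not my actual setup:

```python
from vllm import LLM, SamplingParams

# Baseline offline generation with vLLM's public API; a custom generate method
# would presumably wrap or replace this call.
llm = LLM(model="meta-llama/Llama-3.2-1B-Instruct")
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

outputs = llm.generate(["Explain speculative decoding in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```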
-
For example: https://0809zheng.github.io/2020/04/24/self-attention.html