-
Hello there,
I am trying to implement an iterative pruning process for timm models.
The following code works well when the number of iterative steps is small (i.e. up to ~30-40) but then suddenly breaks …
-
In your paper:
![image](https://github.com/user-attachments/assets/90522342-f265-4852-b69b-77c35cad1095)
But in your code:
class MultiHeadSelfAttention(nn.Module):
    def __init__(self, dim, num_heads):
        s…
-
### Your current environment
```text
The output of `python collect_env.py`
```
### How would you like to use vllm
I need to extend the context length of the gemma2-9b model, along with other mo…
-
#### Summary
Hi again! I encountered a bug while playing with attention and sharding in JAX. The issue occurs with specific sharding setups and fails under certain core configurations.
#### Steps …
-
Hi,
I'm simply trying to run the 'Usage' code, but this outputs a traceback:
Traceback (most recent call last):
  File "/home/fnowak/Mamba Reproduction/mamba.py", line 8, in <module>
    model = Vim(
…
-
### Request Description
The mod [Chat Heads](https://modrinth.com/mod/chat-heads) displays player heads next to their chat messages. Could a mixin be added to make this use Figura avatars when Chat H…
-
Hi,
I am trying to replicate the evaluation results on the AP-10K test set using ViTPose+-Base, as reported in the paper and the repo.
![Screenshot from 2024-10-29 17-35-51](https://github.com/user-atta…
-
Hello, @YTianZHU. I read the Differential Transformer paper and found it very interesting.
Thank you so much for your work.
I was wondering how you visualized the attention scores in Figure 1:
![Ima…
-
In the last TC meeting, we discussed the future of translations. Here are some notes:
**Current Situation**
- Translations are quite outdated across all languages.
- Currently, we rely on the com…
-
# Summary
I'm running Fedora inside Termux via proot-distro. submit50 was installed with pipx. submit50 gets stuck at a certain point when I have Git commit signing enabled.
# Logs
```
…