-
Hello, I noticed that in your code, the projection method of q, k, v is
`self.W_q = nn.Linear(d_model, 2 * self.d_head * num_heads, bias=False)`
However, in another repository I found they calculat…
zziC7 updated
2 weeks ago
-
Valgrind currently reports a memory leak, which needs to be addressed.
```
==27211==
==27211== HEAP SUMMARY:
==27211== in use at exit: 394,772 bytes in 41 blocks
==27…
```
-
This is a brief example adapted from the README.md file:
```python
import torch
import torch.nn as nn
import torch.optim as optim
from flashfftconv import FlashDepthWiseConv1d
B=4
…
```
-
### System Info
python==3.9
transformers==4.41.2
linux
### Who can help?
_No response_
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### Tasks
- [ ] An offi…
-
# Summary
This can have a large performance impact in real attention modules.
The most common pattern (derived from nano-gpt)
```Python
import torch
import torch.nn as nn
import torch.nn.funct…
```
-
The following passes.
```py
import torch
class MyModel(torch.nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()

    def forward(self, x):
        x, _ = x.topk(k…
```
-
#### This issue is about [Algo/DS Name/Question Name](link to resource for the Algo/DS/Question)
- [x] I searched or browsed the repo’s other issues to ensure this is not a duplicate
- [x] I…
-
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
…
-
Is there any plan to compare with OpenDistro Elasticsearch K-NN?
https://opendistro.github.io/for-elasticsearch/features/knn.html
-
I tried exporting the stream Conformer model to ONNX format with the parameters below.
```
python3 wenet/bin/export_onnx_gpu.py --config=$model_dir/train.yaml --checkpoint=$model_dir/final.pt --cmvn_fi…
```