-
**Describe the bug**
The shape of `query_input` is [batch, pos, n_heads, d_model], and the code where the error occurred is meant to reshape `query_input` to [batch, pos, n_heads, d_head].
I found t…
-
### Feature request
Hi! I’ve been researching LLM quantization recently ([this paper](https://arxiv.org/abs/2405.14852)), and noticed a potentially important issue that arises when using LLMs with 1-…
-
I've logged a CPython PR that adds a private [`pathlib._VirtualPath`](https://github.com/python/cpython/blob/e4daac9c27ce1ba2e7a7e0dbbf29e4cc17a32358/Lib/pathlib.py#L788) class:
https://github.com/…
-
Hi. I have noticed that the /TKO signal is not behaving as I would expect it to in the floppy drive signals tests. When the heads are stepped inwards, I would expect /TKO to be negated (not active) an…
-
**Is your feature request related to a problem? Please describe.**
GitHub commits don't support user-verifiable cryptographic signatures. A Heads user downloading the latest commit is trusting solely…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
The precondition of `split_kv` on varied length forward is `seqlenq_ngroups_swapped=True`
https://github.com/Dao-AILab/flash-attention/blob/6df7e0a02edcee851744168079377a039f6d728d/csrc/flash_attn/fl…
-
https://raw.githubusercontent.com/666OS/YYDS/refs/heads/main/mihomo/config/MihomoPro_icon.yaml
…
-
A suggestion to add heads on stakes so that multiple groups can display the heads of those who have wronged them. Whether it be the king displaying the head of a traitor/thief, or the bandits displayi…
-
SparseAttention3D is not used.
`self.attn2 = SparseAttention3D(dim,4,num_heads=num_heads,patch_size=patch_size,stride=stride)`
`self.attn2` is defined but never used.
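
One way to confirm a report like this is to scan the class source for attributes that are assigned on `self` but never read. The sketch below uses only the standard-library `ast` module. The `Block` class and `make_attention` helper are hypothetical stand-ins, not the project's actual code, and the check is purely syntactic, so it would miss reads made through `getattr` or from outside the class.

```python
import ast


def unused_self_attributes(source: str) -> set[str]:
    """Return names assigned as `self.<name> = ...` but never read in `source`."""
    assigned, read = set(), set()
    for node in ast.walk(ast.parse(source)):
        # Only look at attribute accesses of the form `self.<name>`.
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "self"
        ):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.attr)
            elif isinstance(node.ctx, ast.Load):
                read.add(node.attr)
    return assigned - read


# Hypothetical module mirroring the situation described above:
# two attention layers are created, but only one is called in forward().
EXAMPLE = '''
class Block:
    def __init__(self, dim):
        self.attn1 = make_attention(dim)
        self.attn2 = make_attention(dim)

    def forward(self, x):
        return self.attn1(x)
'''

print(unused_self_attributes(EXAMPLE))
```

The source is only parsed, never executed, so the undefined `make_attention` name in the example is harmless. For this example the function reports `attn2` as the only attribute that is assigned but never read.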