-
### 🐛 Describe the bug
With a 2D spatial neighborhood pattern, flash attention is orders of magnitude slower than dense attention:
hlc=2
seq_length : 192
flex attention : 0.0015106382369995117 […
-
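This way, in many cases users could customize readings themselves instead of waiting for an update; going a step further, custom polyphonic-character files could be shared and loaded.
For example, in profanity the character 操 is usually read with the fourth tone.
There are also many Traditional Chinese characters to consider.
For example, “着” is always written as “著”; the customary character forms and pronunciations differ from those used in mainland China.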
-
### What version of gRPC and what language are you using?
pecl-gRPC 1.58.0 (I cannot upgrade to a more recent version: https://github.com/grpc/grpc/issues/36025)
### What operating system (Linux,…
-
## Heap-based overflow in erofs-utils
### project
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git
### env
Tested on Fedora 37
### version
erofs-utils v1.6
### r…
-
#### Version
8.10+alpha (#9078 / https://github.com/ppedrot/coq/commit/6ded8df5d5df4b302b869328b6803c3d7f579a67)
#### Operating system
Linux
#### Description of the problem
```coq
(* -…
-
When using `by` with many small groups, runtime can be slow. For some situations like `cumsum` and `frank`, we could perform it on the whole vector first and use the last value from the previous group…
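
As a rough illustration of the idea for `cumsum`, here is a minimal Python sketch. It assumes the rows are already sorted so that each group is contiguous; the function name `grouped_cumsum` and the sample data are hypothetical and not part of any library. It computes one cumulative sum over the whole vector and then subtracts the running total reached at the end of the previous group:

```python
from itertools import accumulate


def grouped_cumsum(values, group_ids):
    """Per-group cumulative sum, assuming equal group ids are contiguous."""
    total = list(accumulate(values))  # single pass over the whole vector
    out = []
    offset = 0  # whole-vector total at the end of the previous group
    for i, (t, g) in enumerate(zip(total, group_ids)):
        if i > 0 and g != group_ids[i - 1]:
            # A new group starts here: remember where the previous one ended.
            offset = total[i - 1]
        out.append(t - offset)
    return out


result = grouped_cumsum([1, 2, 3, 4, 5], ["a", "a", "b", "b", "b"])
print(result)  # [1, 3, 3, 7, 12]
```

With floating-point input, subtracting a large offset can lose precision compared with summing each group separately, so an approach like this may only suit integer data or cases where that error is acceptable.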
-
Hi! I'm running Ganon on one of my datasets, and for some reason it starts profiling but then hangs indefinitely. I have run this dataset through other profilers without problems, so I guess that …
-
### What is the issue?
I use Proxmox VE for virtualization. If I install Ollama in a Linux VM, it works fine. If I install Ollama in an LXC (host kernel 6.8.4-3), it doesn't work with the CPU.
#####
olla…
-
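In some of the losses, I see that an extra vector is added to the concatenation: `if concatenation_sent_max_square: torch.max(rep_a, rep_b).pow(2)`. Are there experimental results for this? The default concatenation in SENTENCE-TRANSFORMERS is, as cited in the paper, concat(u, v, |u-v|), which has been shown to be effective in a large number of experiments (better sentence semantic-similarity representations). I don't know the source of this trick suggested by 如寐, or…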
-
Hi, I have this problem.
I've been following the tutorial and have already installed ROS-Kinetic and V-REP pro edu, version 3.3.1, on Ubuntu 16.06 x64.
When I install vrep-ros-bridge, I run catkin_make and do …