-
Several of my applications use bfloat types, including applications intended to test the bfloat hardware on the CGRA. What is the plan for supporting bfloat16_t?
-
https://github.com/pytorch-labs/segment-anything-fast/ uses [custom Triton code](https://github.com/pytorch-labs/segment-anything-fast/blob/main/segment_anything_fast/flash_4.py) to implement a varian…
-
Just came across https://arxiv.org/abs/2403.12278 🎉 And I see that you're actually using Julia. Just to point you to [StochasticRounding.jl](https://github.com/milankl/StochasticRounding.jl) which im…
-
### 🚀 The feature
Currently, the examples under references only support the default datatype (float32). Can we support an argument like `--data-type` to allow users to specify the datatype for the model?
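A minimal sketch of what such a flag could look like, assuming the reference scripts use `argparse` and PyTorch. The names `DTYPE_NAMES`, `build_parser`, and `resolve_dtype` are hypothetical and not part of the existing examples:

```python
import argparse

# Hypothetical mapping from CLI values to torch dtype attribute names.
DTYPE_NAMES = {
    "float32": "float32",
    "float16": "float16",
    "bfloat16": "bfloat16",
}


def build_parser():
    """Build a parser exposing a --data-type flag (assumed name, from the request)."""
    parser = argparse.ArgumentParser(description="Reference script (sketch)")
    parser.add_argument(
        "--data-type",
        default="float32",
        choices=sorted(DTYPE_NAMES),
        help="datatype used for the model",
    )
    return parser


def resolve_dtype(name):
    """Translate the CLI string into a torch dtype object."""
    import torch  # imported lazily so argument parsing works without torch

    return getattr(torch, DTYPE_NAMES[name])


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"requested datatype: {args.data_type}")
```

A script could then call something like `model.to(resolve_dtype(args.data_type))` after constructing the model. Whether every reference model trains stably in reduced precision is a separate question that this sketch does not address.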
###…
-
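1. I pulled the image provided in the issue above using `docker pull keyk13/ape_image:v1`.
2. However, the xformers library is not present in the container. `pip install xformers` installs version 0.0.23 and automatically updates the torch version; installing version 0.0.17 instead produces the following error: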
NotImplementedError: No operator found for `memory_efficie…
-
**Please Describe The Problem To Be Solved**
Running on Ubuntu 22.04:
``` bash
sudo ./run.sh -c local -i 1 -b vllm -m Qwen-7B-QAnything -t qwen-7b-qanything -p 1 -r 0.85
```
Error:
qanything-cont…
-
We observed good overlap with FSDP + PGLE:
![Bq7PCuqyJbygSuL](https://github.com/user-attachments/assets/0cff27c4-6499-43d0-b436-ef01a2833ae0). Turning PGLE on and off makes a big difference here.
…
-
## 🚀 Feature
The program:
```
class DynamoModule(torch.nn.Module):
    def forward(self, L_intermediate_parallel_ : torch.Tensor, L_self_modules_dense_4h_to_h_parameters_weight_ : torch.nn.
p…
-
In UNet Shallow, the output tensor from the final layer is of shape=`[1, 1, 337920, 1[32]]`. It is in TILE layout but it would be faster if I converted it to RM layout (to eliminate the padding) befor…
-
As the title says. How can I resolve this?