-
oneflow version: 0.6.0.dev20211215+cu102
Running the `main_swin_eager_consistent_use_fake_data.sh` script on the `run_vit_graph_success` branch reproduces the problem. The error is raised in `EagerNcclS2SKernel` in `eager_nccl_kernels.cu`.
```bash
F1220 16:35…
-
### 🚀 The feature, motivation and pitch
Integrating the oneDNN Graph Compiler into the Inductor C++ backend would enable enhanced pattern fusion and better performance on CPU.
### Motivation
Recent development…
-
I'm running some CTF SpMV kernels using the contraction interface (`a["i"] = B["ij"] * c["j"]`), and I'm seeing performance that is nearly 2 orders of magnitude slower than systems like PETSc and Tril…
-
# 🐛 ValueError while converting gpytorch model to ONNX
I followed this
[Converting Exact GP Models to TorchScript](https://github.com/cornellius-gp/gpytorch/blob/ff1881b5fe92147f5e34e4141c0d5255e2…
-
Hi all,
I followed the tutorial in full and successfully trained my custom ssd_mobilenet_v2_quantized_300x300_coco model to detect 7 classes.
The issue comes when I try to run the webcam det…
-
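1. Dig into the computation process of the matrix multiplication operator and explore the following potential performance points
1.1 Parallelism
1.2 Efficient I/O
1.3 Efficient computation
2. Consider the following features
2.1 Fusing a following activation or the next operator (post-fusion)
2.2 Fusing the preceding operator (pre-fusion)
3. Provide a choice of compute kernels, for example CUDA cores / Tensor Cores on the CUDA platform, or tensor cores / convolution cores on the BANG platform.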
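As a rough illustration of points 1.1, 2.1 and 2.2, here is a minimal pure-Python sketch of a matmul that walks the output in tiles and applies an optional prologue and epilogue inside the same loop nest. The names `tiled_matmul`, `prologue`, `epilogue` and `tile` are hypothetical and do not come from any existing library. A real kernel would also tile the reduction dimension and stage tiles in fast memory, which this sketch does not attempt.

```python
# Minimal sketch with hypothetical names: output-tiled matmul with an
# optional fused prologue (previous elementwise op) and epilogue (activation).

def relu(x):
    return x if x > 0.0 else 0.0


def tiled_matmul(a, b, tile=2, prologue=None, epilogue=None):
    """Compute epilogue(prologue(a) @ b), one output tile at a time."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    # Each (i0, j0) output tile is independent of the others,
    # so on real hardware the tiles could be computed in parallel.
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for i in range(i0, min(i0 + tile, m)):
                for j in range(j0, min(j0 + tile, n)):
                    acc = 0.0
                    for p in range(k):
                        x = a[i][p]
                        if prologue is not None:
                            x = prologue(x)  # pre-fusion: applied as the operand is read
                        acc += x * b[p][j]
                    # post-fusion: applied before the result is stored,
                    # so no second pass over the output is needed
                    out[i][j] = epilogue(acc) if epilogue is not None else acc
    return out


a = [[1.0, -2.0], [3.0, 4.0]]
b = [[1.0, 0.0], [0.0, 1.0]]  # identity, so a @ b equals a
result = tiled_matmul(a, b, epilogue=relu)
```

The point of fusing the activation this way is that each output element is written once, already activated, rather than being stored and then re-read by a separate operator.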
-
Hello,
The `convert.py` script failed while I was trying to convert the ImageNet-trained InceptionV3 model (downloaded from https://github.com/fchollet/deep-learning-models/releases/download/v0.5/inceptio…
-
At run time, I call
`converter.convert()`
`converter.build(input_fn = some_input_fn)`
`converter.save()`
and see logging to the effect of
```
tensorflow/core/grappler/optimizers/meta_optimiz…
-
I'm exploring the performance of Kokkos graphs using a CG example in the code-examples repository that runs with and without Kokkos graphs. I've been running on Vortex, on a single node, single GPU (T…
-
Suppose I have two sparse matrices:
```
G of type BCOO(float64[200, 80], nse=129)
H of type BCOO(float64[80, 200], nse=129)
```
If I do `G @ H`, I get
```
DynamicJaxprTracer[BCOO(float64[80, 80…