-
This issue lists all feature requests and improvements slated for the Oct 2024 Tkw release.
- [ ] Flash Attention Implementation
- [ ] Flash Attention Performance Improvements
- [x] Implicit GEMM…
-
# Bug Report
### Which model does this pertain to?
A number of models cannot be run by ONNX Runtime because some ops are not yet supported.
As this repo is strongly connected with …
-
## Issue description
On the aarch64 Linux platform, Arm Compute Library ([ACL](https://github.com/ARM-software/ComputeLibrary)) is the recommended GEMM backend for PyTorch via MKLDNN. Currently ACL is…
-
For each algorithm in MachSuite, how do I generate the corresponding executable and the files referenced in gem5.cfg, such as dynamic_trace.gz?
For example, I want to simulate the test_a…
-
### Request description
From Nod.ai meeting 6/30, filing new issue
### What component(s) does this issue relate to?
_No response_
### Additional context
_No response_
-
Hi,
I am working with a simple ONNX network exported from PyTorch. The last fully connected layer (with bias) is exported as a Gemm node.
After quantization (quantize_static) with the last onnxrt versio…
-
The first step toward optimization is to know where you are now.
+ [x] Write a benchmark against numpy & julia
+ ideas
+ [ ] broadcasting -- switch to the compilation based approach, similar to …
-
Consider:
```
julia> @macroexpand @inplace C -= R*2*S
:(InplaceLinalg.C_AB!(C, 1, -R, 2, S))
```
If `R` is a matrix (or vector), then `-R` is not computed in place, and the negation is unnecessary in any case, since…
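To illustrate the concern about `-R`, here is a minimal sketch in plain Python (not Julia, and not InplaceLinalg's implementation). It assumes the update has the GEMM-like form `C <- beta*C + alpha*(A @ B)`; the helper `gemm_inplace` and the sample matrices are hypothetical and exist only for this example. The sketch compares negating `R` into a temporary against folding the sign into the scalar coefficient, which gives the same result without allocating a copy of `R`.

```python
def gemm_inplace(C, alpha, A, B, beta=1.0):
    """Hypothetical helper: update C <- beta*C + alpha*(A @ B) in place.

    C, A and B are lists of lists of floats; only C is modified.
    """
    n, k, m = len(A), len(B), len(B[0])
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += A[i][p] * B[p][j]
            C[i][j] = beta * C[i][j] + alpha * acc
    return C


R = [[1.0, 2.0], [3.0, 4.0]]
S = [[5.0, 6.0], [7.0, 8.0]]

# Variant 1: mirrors the macro expansion, which builds -R as a temporary matrix.
C_naive = [[100.0, 100.0], [100.0, 100.0]]
neg_R = [[-x for x in row] for row in R]  # extra allocation, same size as R
gemm_inplace(C_naive, 2.0, neg_R, S)

# Variant 2: fold the sign into the scalar, so no temporary is needed.
C_folded = [[100.0, 100.0], [100.0, 100.0]]
gemm_inplace(C_folded, -2.0, R, S)

print(C_naive == C_folded)
```

Both variants compute `C - 2*R*S`; the second only changes a scalar, so an expansion along these lines would avoid the temporary regardless of whether `R` is a matrix or a vector.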
-
I am really interested in your work. I tried to use this tool; however, I ran into problems that I could not solve.
[ArtLabel_errors.pdf](https://github.com/clatfd/GNN-ART-LABEL/files/7906232/ArtLabe…
-
### System Info
ubuntu 20.04
tensorrt 10.0.1
tensorrt-cu12 10.0.1
tensorrt-cu12-bindings 10.0.1
tensorrt-cu12-libs 10.0.1
tensorrt-llm 0.10.…