-
I have a few questions regarding the usage of tensor functions.
1. When trying to run the cryptonets file, I always get the help text as output; it would be great if the command to run the file could be …
-
Hey
I know that you guys optimized this project for the A100, and I read that people got the 4090 and the 3090 running. I am only able to work with 2080s (University).
When I try to run your co…
qub3s updated
2 months ago
-
hi, @merrymercy
I am working on Winograd on CUDA.
I found that the batched MM in your Winograd implementation is slow on NVIDIA architectures. I guess this is because when C is large, it cannot use the parallel power of …
-
## Problem
The `onnx_legalizer.py` code is hard to understand; its readability needs to be improved.
## What to do
- [x] replace general `transformer.make_node` method with specialized methods, like `m…
-
**Describe the problem you are solving**
Currently, the `Tile` accessor in `BaseMatrix` is not const-accessible.
https://github.com/icl-utk-edu/slate/blob/6dbdcd2bf5702d366e6955ffab0586e2aa4eaed1…
-
On Ubuntu 16.04, when running
~/repos/DeepBench/code/intel/gemm/run_mkl_igemm_ia.sh
on a single-CPU machine (Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz):
IGEMM benchmark
GEMM_S8U8S32 -
…
-
As part of adding support (https://github.com/openxla/iree/pull/16854/), it might be useful to have a way to easily inject model-specific optimizations.
Main findings:
1)…
-
### System Info
trtllm-build --checkpoint_dir ./tllm_checkpoint_1gpu_bf16 \
--output_dir ./tmp/llama/7B/trt_engines/bf16/1-gpu \
--gpt_attention_plugin bfloat16 \
…
-
## error log
```
Gather not supported yet!
# axis=1
Gather not supported yet!
# axis=1
Gather not supported yet!
# axis=1
Cast not supported yet!
# to=6
Cast not supported yet…
```
wwdok updated
2 years ago
-
### Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
### Describe the bug
Hello, I have created the spa…