-
Hello, I encountered some problems when loading the Llama3.2-90B-Vision-Instruct model with FP8. Can you help me take a look?
Version of llama_stack and llama_models:
```
llama_models == 0.0.41
…
-
**Is your feature request related to a problem? Please describe.**
I am aware that PyTriton already has an example for using PyTriton with tensorrt_llm. But I noticed that the example only supports s…
-
### Bug summary
When running DeePMD-kit, an error occurs related to a mismatch in tensor shapes. Specifically, the error message indicates that the shape of the input tensor does not match the expect…
-
To reduce memory usage, I use optimum.quanto to quantize the transformer, controlnet, and t5encoder to fp8, but I encounter an error:
```
File "/home/yongfang/miniconda3/envs/diffusers/l…
-
Take this file as input:
```
func.func @conv2d_accumulate_2_32_32_32_times_3_3_64_dtype_i1_i1_i1(%lhs: tensor, %rhs: tensor, %acc: tensor) -> tensor {
%result = linalg.conv_2d_nchw_fchw {dilations =…
-
I encountered a bug in dorado that causes it to crash.
When basecalling with multiple models and passing POD5 data files from different runs together, the program crashes every time.
dorado basecaller s…
-
I tried to run the training code on both datasets, but both runs failed with an error at the same line of code during evaluation.
The error is at line 205: vqa_result = evaluation(model_without_ddp, test_loader, device, config…
-
```
template
CUTLASS_DEVICE void
mma(Params const& mainloop_params,
MainloopPipeline pipeline_k,
MainloopPipeline pipeline_v,
PipelineState& smem_pipe_read_k…
-
**System Information**
OS: Ubuntu 22.04 (via docker)
Arch: grayskull e75
Commit: https://github.com/tenstorrent/tt-forge-fe/commit/685c8954bc0bd93e00bf84fe68a2cd65063a67c0 (latest)
**Problem**…
-
When the model uses bfloat16 ops, the optimizer fails with the following error. We should handle custom types from ONNX in `_constant_folding`.
```pytb
Traceback (most recent call last):
File "/works…