-
GS e2e Demo
The test uses batched real inputs and produces batched outputs.
The test is looped: it takes several batches of inputs and produces several batches of outputs.
The test evaluates results of TT …
-
### 🐛 Describe the bug
```
In [1]: import torch
In [2]: a = torch.empty((256, 512), requires_grad=True).unsqueeze(0)
In [3]: b = torch.empty((4, 128, 512), requires_grad=True).transpose(-1, -2…
```
-
Replicating tutorials/API_10_device.ipynb, I see no load on the GPU, just the CPU. VRAM gets occupied, however.
Checking the device of the dataset returns "cuda"; the model parameters, however, return "…
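A quick way to verify where things actually live (a minimal sketch; the tutorial's own model and dataset are not shown here, so a stand-in `nn.Linear` is used):

```python
import torch
import torch.nn as nn

# Stand-in model; substitute the tutorial's model here.
model = nn.Linear(512, 10)

# The canonical way to check a model's device is via its parameters.
print(next(model.parameters()).device)  # "cpu" until .to("cuda") is called

if torch.cuda.is_available():
    model.to("cuda")  # nn.Module.to moves parameters in place
    print(next(model.parameters()).device)
```

If the dataset tensors report `cuda` but the parameters report `cpu`, the forward pass either fails with a device-mismatch error or silently runs on the CPU, which matches the symptom described.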
-
We are working on increasing support for sparse tensors. Currently we have [summarized the current state of sparse tensors](https://github.com/pytorch/pytorch/issues/9674) and listed out [sparse ops to su…
-
The home page doesn't load.
![CleanShot 2024-01-17 at 12 23 36@2x](https://github.com/bcc-code/bmm-web/assets/18753964/65ab7da2-558b-4661-8818-dc40ef8549e1)
-
I believe we don't have a test with `batch > 1` and bias enabled. This block of code, in this compute kernel:
`tt_eager/tt_dnn/op_library/bmm/kernels/compute/bmm_large_block_zm_fuse…
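For reference, the missing case can be expressed against a framework baseline (a hedged sketch in plain PyTorch with arbitrary example shapes; it is not the TT kernel itself): a batched matmul whose output gets a bias broadcast across every batch.

```python
import torch

batch, M, K, N = 4, 32, 64, 32   # arbitrary example shapes, batch > 1
a = torch.randn(batch, M, K)
b = torch.randn(batch, K, N)
bias = torch.randn(N)            # one bias row, broadcast over batch and M

# The batch > 1, bias-enabled case a kernel test would need to cover.
out = torch.bmm(a, b) + bias
print(out.shape)                 # torch.Size([4, 32, 32])
```

A golden-output comparison against something like this would exercise exactly the untested block.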
-
### 🐛 Describe the bug
I implemented a naive attention block with matmul and softmax, then compared its performance with that of torch.nn.functional.scaled_dot_product_attention. I found a very str…
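For context, a minimal version of that comparison (an assumed reconstruction, not the reporter's actual code):

```python
import math
import torch
import torch.nn.functional as F

q = torch.randn(2, 4, 128, 64)   # (batch, heads, seq_len, head_dim)
k = torch.randn(2, 4, 128, 64)
v = torch.randn(2, 4, 128, 64)

# Naive attention: explicit matmul + softmax + matmul
scores = q @ k.transpose(-1, -2) / math.sqrt(q.size(-1))
naive = scores.softmax(dim=-1) @ v

# Fused kernel; flash/mem-efficient backends may be selected internally.
fused = F.scaled_dot_product_attention(q, k, v)

# The two should agree numerically; any surprise is in the timings.
print(torch.allclose(naive, fused, atol=1e-5))
```

Timing both paths under `torch.no_grad()` with warm-up iterations is what usually exposes the performance gap.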
-
(Update) As I mentioned in this [comment](https://github.com/tenstorrent-metal/tt-metal/issues/5168#issuecomment-1940293930), this issue is related to the TT_METAL_DEVICE_PROFILER env variable.
*…
-
Hello, when I converted my ONNX model to TensorRT with the command
`./trtexec --onnx=model.onnx --saveEngine=model.engine`
I got a big diff between the PyTorch result and the TRT result. I located the problem w…
-
### System Info
transformers==4.39.3
torch==2.2.2
CUDA: 12.1 (RTX 3090 * 4)
Python 3.10
### Who can help?
@ArthurZucker @younesbelkada @gante
### Information
- [ ] The official example scripts…