-
Hello, I want to change my target model to llava-1.5-7B, but my code reports an error:
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmStridedBatchedExFix( handle, opa…
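Not part of the original report, but a debugging note that often helps with this class of error: cuBLAS/CUDA failures are reported asynchronously, so the traceback may point at a later call than the one that actually failed. Forcing synchronous kernel launches makes the traceback land on the real op. A minimal sketch (the variable must be set before CUDA is initialized):

```python
import os

# CUDA errors surface asynchronously by default; force synchronous launches
# so the Python traceback points at the kernel that actually failed.
# This must be set BEFORE torch is imported / CUDA is initialized.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch
# ... then run the failing llava-1.5-7B code; the error will now surface at
# the exact failing call (often a dtype/shape mismatch or an out-of-memory
# condition inside a batched matmul).
```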
-
Hello,
We observed an issue where a tensor broadcast from a single-dimension parameter is marked as sharded by the XLA sharding propagator. This sharded tensor, while doing computation with another tensor which ha…
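For readers unfamiliar with the broadcast step being discussed: a single-dimension parameter is expanded to the other operand's shape under the usual NumPy/XLA broadcasting rules before the computation runs, and the sharding annotation travels with the expanded tensor. A minimal pure-Python sketch of those shape rules (illustration only, not the XLA propagator):

```python
def broadcast_shape(a, b):
    """Compute the broadcast result shape of two shapes (NumPy/XLA rules).

    Shapes are right-aligned; each dimension pair must match or contain a 1.
    """
    # Left-pad the shorter shape with 1s so both shapes have equal rank.
    a = (1,) * (len(b) - len(a)) + tuple(a)
    b = (1,) * (len(a) - len(b)) + tuple(b)
    result = []
    for da, db in zip(a, b):
        if da != db and 1 not in (da, db):
            raise ValueError(f"incompatible dimensions {da} and {db}")
        result.append(max(da, db))
    return tuple(result)

# A single-dimension parameter broadcast against a 2-D activation:
print(broadcast_shape((100,), (8, 100)))   # -> (8, 100)
```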
-
Thanks for sharing your script to run the 4-bit quantized molmo-7b.
Unfortunately, I am unable to run it on my server (Ubuntu 22.04 with 2x RTX A5000, 48 GB VRAM); the error trace is below.
I wonde…
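As background on what a 4-bit quantized checkpoint involves (this is not the bitsandbytes implementation the script uses): weights are stored as 4-bit integers plus a stored scale and are dequantized on the fly. A minimal absmax round-trip sketch in pure Python:

```python
def quantize_4bit(values):
    """Absmax-quantize a list of floats to signed 4-bit codes plus one scale."""
    scale = max(abs(v) for v in values) / 7.0  # signed 4-bit range is [-8, 7]
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale

def dequantize_4bit(codes, scale):
    """Recover approximate floats from the 4-bit codes and the stored scale."""
    return [c * scale for c in codes]

weights = [0.9, -0.35, 0.02, -0.7]
codes, scale = quantize_4bit(weights)
approx = dequantize_4bit(codes, scale)
# Each reconstructed weight is within half a quantization step of the original.
```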
-
## 🐛 Bug
## To Reproduce
Here are two scripts for the experiment:
test1.py
```python
import torch
import torch_xla.core.xla_model as xm
import math
random_k = torch.randn((100, 100), dtype=…
-
(allegro) D:\PyShit\Allegro>python single_inference.py ^
More? --user_prompt "A seaside harbor with bright sunlight and sparkling seawater, with many boats in the water. From an aerial view, the boats…
-
### 🐛 Describe the bug
Under specific inputs, `reflection_pad1d` triggered a crash.
```python
import torch
self = torch.full((9, 7, 9, 9,), 1e+13, dtype=torch.double)
padding = [-1, -1]
torch.…
-
`aten.constant_pad_nd.default` currently does not support lowering to `ttnn.pad` when its pad contains a negative value;
otherwise it raises `TypeError: __call__(): incompatible function arguments. The following argument ty…
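To make the unsupported case concrete: in `constant_pad_nd`, a negative pad amount crops from that side instead of padding, and that is the case that cannot currently be lowered to `ttnn.pad`. A pure-Python 1-D sketch of the semantics (illustration only, not the ATen implementation):

```python
def constant_pad_1d(xs, pad_left, pad_right, value=0):
    """ATen-style 1-D constant pad; negative amounts crop from that side."""
    out = list(xs)
    # Left side: pad with `value`, or drop elements when the amount is negative.
    out = [value] * pad_left + out if pad_left >= 0 else out[-pad_left:]
    # Right side: same rule applied at the tail.
    out = out + [value] * pad_right if pad_right >= 0 else out[:pad_right]
    return out

print(constant_pad_1d([1, 2, 3, 4], 2, 0))    # -> [0, 0, 1, 2, 3, 4]
print(constant_pad_1d([1, 2, 3, 4], -1, -1))  # -> [2, 3]
```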
-
Config:
Windows 10 with RTX4090
All requirements incl. flash-attn build - done!
Server:
```
(venv) D:\PythonProjects\hertz-dev>python inference_server.py
Using device: cuda
Loaded tokeniz…
-
Thank you for taking the time to review my question.
Before I proceed, I would like to mention that I am a beginner, and I would appreciate your consideration of this fact.
I am seeking assistan…
-
Hello, I have this error when trying the auto segmentation node
Sam2AutoSegmentation
Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator d…
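The failing operator is non-maximum suppression; this error usually means the installed torchvision build does not match the torch/CUDA version, so reinstalling matched wheels is the first thing to try. For reference, here is what the op computes, as a minimal pure-Python sketch (not the torchvision implementation):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, greedily suppressing high-IoU overlaps."""
    # Visit boxes in descending score order; keep a box only if it does not
    # overlap an already-kept box above the threshold.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```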