-
```
import tripy as tp

# Build a 3x3x2 iota tensor (float32 by default), then cast it to int64.
data = tp.iota((3, 3, 2))
data = tp.cast(data, dtype=tp.int64)
print(data)
```
Throws the following error:
```
Traceback (most recent call last):
File "/tripy/debugging_gather.py", l…
-
Hello,
We are interested in deploying the recently published Large World Model (LWM) on Triton Inference Server:
https://largeworldmodel.github.io/
https://huggingface.co/LargeWorldModel/LWM-Chat-32K-Jax/tree…
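For anyone triaging this: a common starting point for serving a JAX checkpoint like this in Triton is the Python backend. Below is a minimal skeleton only, to show the shape of such a deployment; the tensor names (`PROMPT`, `OUTPUT`) and the loading/generation calls are placeholders, not code from the LWM repo:
```
import numpy as np
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def initialize(self, args):
        # Load the LWM-Chat-32K-Jax checkpoint here. The JAX loading code is
        # model-specific and intentionally omitted from this sketch.
        self.model = None

    def execute(self, requests):
        responses = []
        for request in requests:
            # "PROMPT" is an assumed input name declared in config.pbtxt.
            prompt = pb_utils.get_input_tensor_by_name(request, "PROMPT").as_numpy()
            # Placeholder for the actual LWM generation call.
            text = np.array([b"<generated text>"], dtype=np.object_)
            responses.append(pb_utils.InferenceResponse(
                output_tensors=[pb_utils.Tensor("OUTPUT", text)]))
        return responses
```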
-
### Describe the issue
I tried to build with CUDA 12.5 and TensorRT 10.0 on Windows, and saw errors like `error C4996: 'nvinfer1::IPluginV2': was declared deprecated` during the build.
### Urgency
None
### T…
-
Hi, the following error occurs when converting a Torch model to a TensorRT model: `TypeError: forward() missing 1 required positional argument: 'multimask_output'`.
But I trained th…
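One common cause of this error is that tracing/export paths only pass tensor inputs to `forward`, so non-tensor flags such as `multimask_output` have to be bound before export. A minimal sketch of that pattern (the wrapper is illustrative and `model` stands in for the trained network; neither is from the report):
```
import torch

class ExportWrapper(torch.nn.Module):
    # Bind the non-tensor flag at construction time so the exporter
    # only sees tensor inputs during tracing.
    def __init__(self, model, multimask_output=True):
        super().__init__()
        self.model = model
        self.multimask_output = multimask_output

    def forward(self, x):
        return self.model(x, multimask_output=self.multimask_output)
```
Exporting `ExportWrapper(model)` instead of `model` then supplies the missing argument automatically.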
-
### System Info
- GPU: 4 x A10G (EC2 g5.12xlarge), 24 GB memory each
- TRTLLM v0.12.0
- torch 2.4.0
- cuda 12.5.1
- tensorrt 10.1
- triton 24.04
- modelopt 0.15
### Who can help?
_No response_
### Info…
-
Hey there Linamo1214,
First of all, great job with the TRT work. I have one question, though. I proceeded with the conversion like this.
On my laptop, running Ubuntu 22.04, without any NVIDIA GPU…
-
```
Traceback (most recent call last):
  File "D:\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1879, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_modu…
```
-
**Setup**
Machine: AWS Sagemaker ml.p4d.24xlarge
Model: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
Used Docker container image with the latest build of trt-llm (`0.8.0.dev2024011…
-
## Description
I tried to convert the Flux DiT model on an L40S with TensorRT 10.5 and found that peak GPU memory exceeded 46068 MiB, while only 23597 MiB of GPU memory was occupied during inference. Is this n…
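For what it's worth, builder-time peaks often come from TensorRT's workspace and tactic-timing allocations rather than the weights themselves, so they are not expected to match steady-state inference usage. To confirm the numbers, device-level memory can be sampled with pynvml (a sketch, assuming the L40S is GPU index 0 and the loop runs in a separate process while the conversion executes):
```
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # assumes the L40S is device 0

peak = 0
for _ in range(600):  # poll for ~10 minutes while the build runs
    peak = max(peak, pynvml.nvmlDeviceGetMemoryInfo(handle).used)
    time.sleep(1.0)
print(f"peak device memory: {peak / 2**20:.0f} MiB")
```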
-
## ❓ Question
## What you have already tried
I am trying to convert a transformer model to TRT in fp16 (fp32 works fine 🙂). It includes a bunch of LayerNorms, all of which have explicit casting…
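For context, the explicit-casting pattern described here typically looks something like the following in PyTorch (a sketch of the general technique, not the poster's code): compute the normalization in fp32, then cast back to the input dtype.
```
import torch.nn as nn
import torch.nn.functional as F

class FP32LayerNorm(nn.LayerNorm):
    # Cast the input and parameters to fp32, normalize, then return to
    # the caller's dtype (fp16 when the network runs in half precision).
    def forward(self, x):
        out = F.layer_norm(
            x.float(),
            self.normalized_shape,
            self.weight.float() if self.weight is not None else None,
            self.bias.float() if self.bias is not None else None,
            self.eps,
        )
        return out.to(x.dtype)
```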