Tencent / HunyuanDiT

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
https://dit.hunyuan.tencent.com/

Error when building the TRT engine myself #65

Open flysssss opened 4 months ago

flysssss commented 4 months ago

Environment: H100
Base image: docker pull pytorch/pytorch:2.3.0-cuda11.8-cudnn8-devel
Python: 3.10
Steps: followed the instructions at https://hf-mirror.com/Tencent-Hunyuan/TensorRT-libs/blob/main/README_zh.md
Error:

CUDA runtime API error cudaErrorNoKernelImageForDevice at line 298 in file fMHAPlugin.cu
CUDA runtime API error cudaErrorNoKernelImageForDevice at line 298 in file fMHAPlugin.cu

Is there a clearly documented environment for this, ideally a Docker image? Building the image myself is too much work.
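For context, cudaErrorNoKernelImageForDevice is the CUDA runtime error for "no kernel image is available for execution on the device", which generally means the binary being run was not compiled for the architecture of the GPU in use. The sketch below is one way to reason about that mismatch. It lists the compute capabilities of the GPUs mentioned in this thread and checks them against a list of target architectures. The architecture list in `ASSUMED_PLUGIN_ARCHS` is purely hypothetical (the thread does not say which architectures the prebuilt plugin targets), and the helper names are made up for illustration. The check is also simplified: it ignores the PTX JIT fallback that can let a binary run on a newer architecture.

```python
# Compute capabilities of the GPUs mentioned in this thread (NVIDIA public specs).
GPU_CAPABILITY = {
    "V100": (7, 0),
    "RTX 4080": (8, 9),
    "H100": (9, 0),
}

# HYPOTHETICAL: the architectures the plugin library was compiled for.
# The thread does not state this; replace with the real list if it is known.
ASSUMED_PLUGIN_ARCHS = {"sm_80", "sm_86"}


def sm_tag(capability):
    """Turn a (major, minor) compute capability into an 'sm_XY' tag."""
    major, minor = capability
    return f"sm_{major}{minor}"


def has_kernel_image(capability, built_archs):
    """Simplified check: True only if a binary for exactly this arch was built.

    Ignores PTX JIT compilation, which can provide forward compatibility.
    """
    return sm_tag(capability) in built_archs


def local_capability():
    """Return the local GPU's (major, minor) capability, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


# Report which of the GPUs from the thread the assumed arch list would cover.
for name, cap in GPU_CAPABILITY.items():
    covered = has_kernel_image(cap, ASSUMED_PLUGIN_ARCHS)
    print(f"{name}: {sm_tag(cap)} covered by assumed build: {covered}")
```

On a machine with PyTorch and a working CUDA device, `local_capability()` returns the value to compare against whatever architectures the plugin was actually built for.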

XinPeiHou commented 4 months ago

Hello, did you write the fMHAPlugin.cu file yourself, or did you download it from somewhere?

flysssss commented 4 months ago

That file is probably the implementation of one of TensorRT's plugin layers. I don't know the details of the implementation.

Bu-Tianxing commented 3 months ago

My GPU is a 4080, and I ran into the same problem.

scottwong1110 commented 3 months ago

Same problem here, on a V100.