stochasticai / x-stable-diffusion

Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention. Join our Discord community: https://discord.com/invite/TgHXuSJEk6
https://stochastic.ai
Apache License 2.0

Can not reproduce the TensorRT result #14

Closed lileilai closed 2 years ago

lileilai commented 2 years ago

I have tried the process described in this repo for TensorRT, but I cannot reproduce the TensorRT latency on an A100. My FP16 result is about 3.2 s, which is higher than the number you posted.

Toan-Do commented 2 years ago

This latency is usually seen on the first run. You should warm up with at least 5 iterations, then measure latency by averaging the next 50 iterations (our latency benchmark code).
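The warmup-then-average pattern described above can be sketched as follows. This is a minimal illustration, not the repo's actual benchmark code; `run_fn` is a hypothetical callable that would wrap your TensorRT pipeline inference:

```python
import time

def measure_latency(run_fn, warmup=5, iters=50):
    """Average latency of run_fn, excluding warmup iterations."""
    # Warmup: first runs include one-time costs (context setup,
    # allocations, kernel tuning) and are not representative.
    for _ in range(warmup):
        run_fn()
    # Average over the next `iters` steady-state runs.
    start = time.perf_counter()
    for _ in range(iters):
        run_fn()
    return (time.perf_counter() - start) / iters

# Stand-in workload; replace with your pipeline call,
# e.g. lambda: pipe(prompt).
latency = measure_latency(lambda: sum(range(10000)))
print(f"avg latency: {latency * 1000:.3f} ms")
```

For GPU workloads you would also need to synchronize the device (e.g. `torch.cuda.synchronize()`) before reading the timer, since kernel launches are asynchronous.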

You can also compare against our Stochasticx CLI deployment, which already supports TensorRT and AITemplate on A100; follow our instructions (here). The commands to deploy TensorRT/AITemplate on your machine are: