Closed alecyan1993 closed 5 months ago
Hi,
Thanks for the amazing work. Is there any way to save the compiled model and reuse it for acceleration? Thanks!
That would be quite complex, since the compiled pipeline has many dynamic parts. My suggestion is to find ways to speed up the loading and compilation instead.
Thanks for the reply!
@chengzeyi Do you have any suggestions on how to speed up the compilation? This step takes the most time, between 30 and 45 seconds. We tested with a 4090 and a 3080 on Windows (sadly, Triton isn't available there, and we are forced to use native Windows, not WSL), and the compilation time is almost the same on both.
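To check how much of that 30 to 45 seconds is one-time compilation rather than steady-state inference, it can help to time the first call separately from the later ones. Below is a minimal sketch, assuming the compiled pipeline can be wrapped in a zero-argument callable. `time_calls` and `make_fake_pipeline` are hypothetical names made up for illustration, and the fake pipeline only simulates a slow first call with `time.sleep`; it is not a real compiled model.

```python
import time
from typing import Any, Callable, List, Tuple


def time_calls(fn: Callable[[], Any], repeats: int = 3) -> Tuple[float, List[float]]:
    """Time the first call to fn separately from `repeats` later calls.

    Returns (first_call_seconds, [later_call_seconds, ...]).
    """
    start = time.perf_counter()
    fn()
    first = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        later.append(time.perf_counter() - start)
    return first, later


def make_fake_pipeline(compile_seconds: float, run_seconds: float) -> Callable[[], None]:
    """Hypothetical stand-in: slow on the first call, fast afterwards."""
    state = {"compiled": False}

    def pipeline() -> None:
        if not state["compiled"]:
            time.sleep(compile_seconds)  # simulated one-time compilation cost
            state["compiled"] = True
        time.sleep(run_seconds)  # simulated steady-state inference cost

    return pipeline


if __name__ == "__main__":
    first, later = time_calls(make_fake_pipeline(0.2, 0.01))
    print(f"first call: {first:.3f}s, later calls: {[round(t, 3) for t in later]}")
```

With a real pipeline you would pass something like `lambda: pipe(prompt)` instead of the fake one. If the first call dominates on both GPUs, that would match the observation that the 4090 and 3080 take about the same time, since the cost would then be compilation rather than GPU work.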