chat.load( device="cuda", use_flash_attn=True ): # Setting compile=True causes a hang.

yourengod commented 1 month ago

Why does setting compile=True cause a hang?

medemi68 commented 1 month ago

I don't speak Chinese, but I ran into the same issue. It doesn't actually hang: on a 4090, the first generation takes 10-15 minutes to compile. After you set compile=True, you'll likely have to run a warmup generation before the compilation actually happens, and that step is what takes so long. It may look like it's hanging, but it isn't. The speedup was also very minor for me: throughput went from 38 it/s to 45 it/s (about 18%), and an 8-second audio clip took 9 seconds to generate.
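
To tell a slow first compile apart from a real hang, it may help to time the warmup call separately from the calls that follow. Below is a minimal sketch of that idea, not something taken from the ChatTTS codebase: `time_calls` and `report` are hypothetical helper names, and the ChatTTS wiring in the trailing comment assumes the `ChatTTS.Chat()` / `chat.infer(...)` usage from the project README together with the `chat.load(...)` arguments quoted in this issue.

```python
import time
from typing import Callable, List


def time_calls(generate: Callable[[], object], runs: int = 3) -> List[float]:
    """Call `generate` `runs` times and return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        generate()
        durations.append(time.perf_counter() - start)
    return durations


def report(durations: List[float]) -> str:
    """Format the first (warmup) call separately from the later steady-state calls."""
    warmup, rest = durations[0], durations[1:]
    if not rest:
        return f"warmup: {warmup:.2f}s"
    steady = sum(rest) / len(rest)
    return f"warmup: {warmup:.2f}s, steady state: {steady:.2f}s per call"


# Hypothetical wiring for ChatTTS (assumed API, not executed here;
# it needs the ChatTTS package and a CUDA GPU):
#   import ChatTTS
#   chat = ChatTTS.Chat()
#   chat.load(compile=True, device="cuda", use_flash_attn=True)
#   print(report(time_calls(lambda: chat.infer(["warmup text"]))))
```

If the first duration is in the minutes and the later ones drop to seconds, the process was compiling rather than hanging.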

yourengod commented 1 month ago

Thank you, I'll try again

2noise / ChatTTS

chat.load( device="cuda", use_flash_attn=True ): # Setting compile=True causes a hang. #797