collabora / WhisperFusion

WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.

Optimize Docker Builds with Multi-Stage and Caching #43

Closed: shiqimei closed this issue 8 months ago

shiqimei commented 8 months ago

Key Takeaways:

zoq commented 8 months ago

@shiqimei this looks like an important change; should we reopen the PR?

shiqimei commented 8 months ago

Hi @zoq . Thanks for noticing this PR!

This script is problematic because it fails to properly transfer build outputs between stages. Feel free to fix it if you prefer this approach; a rough sketch of the missing step follows.
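
For illustration only (the image tags, stage names, and paths below are hypothetical and not taken from this repository): in a multi-stage build, anything produced in a builder stage is discarded unless it is explicitly copied into the final stage with `COPY --from=...`.

```Dockerfile
# Hypothetical sketch of transferring build outputs between stages.
# Image tags, stage names, and paths are placeholders.
FROM nvidia/cuda:12.2.2-devel-ubuntu22.04 AS builder
WORKDIR /src
# ... run the expensive compile here (e.g. a local cmake build) ...
RUN mkdir -p /opt/out && echo "compiled artifact" > /opt/out/artifact.bin

FROM nvidia/cuda:12.2.2-runtime-ubuntu22.04 AS runtime
# Without this COPY --from, nothing from the builder stage reaches the final image.
COPY --from=builder /opt/out /opt/out
```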

Another solution I'd recommend now is:

Relocating the installation of platform-agnostic dependencies (git clones, curl downloads, pip packages, apt packages, and Hugging Face models such as collabora/whisperspeech and charactr/vocos-encodec-24khz) to the base image.

Consequently, users would only need to compile tensorrt_llm for their own CUDA devices during the Docker build. While this approach significantly increases the base image's size, it makes the build far more stable: its network dependencies shrink to ghcr.io and local cmake builds, which are substantially more reliable than the assortment of resources the current build pulls from. In my case, the build failed 7-8 times on various download errors, and each failure meant re-downloading everything (over 300 GB of redundant data) before it finally completed.
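
A minimal sketch of the proposed split, assuming a prebuilt base image published once (e.g. to ghcr.io). The image name, the model pre-download calls, and the placeholder tensorrt_llm step are illustrative, not the project's actual Dockerfile:

```Dockerfile
# --- base image: built and pushed once; everything platform-agnostic lives here ---
FROM nvidia/cuda:12.2.2-devel-ubuntu22.04 AS whisperfusion-base
RUN apt-get update && apt-get install -y --no-install-recommends git curl python3-pip \
    && rm -rf /var/lib/apt/lists/*
RUN pip3 install --no-cache-dir huggingface_hub
# Bake the Hugging Face models into the base image so user builds never re-download them.
RUN python3 -c "from huggingface_hub import snapshot_download; \
    snapshot_download('collabora/whisperspeech'); \
    snapshot_download('charactr/vocos-encodec-24khz')"

# --- user-facing build: only the CUDA-specific step remains ---
FROM whisperfusion-base
# Placeholder: compile tensorrt_llm for the local GPU here; the real command
# comes from the project's existing build scripts, not from this sketch.
RUN echo "compile tensorrt_llm for the local CUDA architecture"
```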

Additional points to consider: