This is my first time contributing to a project on GitHub, so I may not be entirely familiar with the best practices. Nevertheless, I am committed to actively listening to feedback and enhancing my code accordingly.
This pull request addresses two issues:
Issue #43 pointed out that the break duration between paragraphs is supported for azure-tts but not for edge-tts. After examining the source code and considering feedback from @briankendall, I found that the edge-tts module does not support SSML and automatically replaces newlines with spaces. To work around this, I split the text into chunks and insert silent sound markers ("@BRK#") between paragraphs, using the pydub library to generate the silence. Note that this solution requires ffmpeg for silent sound generation.
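The chunking idea can be sketched roughly as follows. This is a minimal illustration, not the exact code in this PR: the function names `chunkify` and `assemble`, the `synthesize` callback, the default `break_ms` value, and the assumption that paragraphs are separated by blank lines are all hypothetical. Only the "@BRK#" marker and the use of pydub come from the description above.

```python
import re

BREAK_MARKER = "@BRK#"  # marker string named in this PR description


def chunkify(text, marker=BREAK_MARKER):
    """Split text into paragraphs and put a break marker between them.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks = []
    for index, paragraph in enumerate(paragraphs):
        if index > 0:
            chunks.append(marker)
        # Collapse remaining whitespace (including single newlines) to spaces.
        chunks.append(" ".join(paragraph.split()))
    return chunks


def assemble(chunks, synthesize, break_ms=1000, marker=BREAK_MARKER):
    """Concatenate audio for each chunk, replacing markers with silence.

    `synthesize` is a hypothetical callable that takes a text chunk and
    returns a pydub AudioSegment. pydub is imported lazily so that
    `chunkify` can be used without it.
    """
    from pydub import AudioSegment

    audio = AudioSegment.empty()
    for chunk in chunks:
        if chunk == marker:
            audio += AudioSegment.silent(duration=break_ms)  # milliseconds
        else:
            audio += synthesize(chunk)
    return audio
```

Keeping the splitting step separate from the audio step means the text handling can be checked without pydub or ffmpeg installed.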
Additionally, I noticed that the supported voice list is outdated and is missing recent additions such as "en-US-SteffanNeural". To address this, I changed the code to fetch the voice list dynamically from a Microsoft online URL instead of hardcoding it, so that users always have access to the latest available voices.
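A rough sketch of fetching the list dynamically is shown below. It is an illustration under stated assumptions rather than the code in this PR: the function names are hypothetical, the URL is left as a parameter for the caller to supply, and the payload is assumed to be a JSON array of objects with "ShortName" and "Locale" fields.

```python
import json
import urllib.request


def parse_voices(payload, locale_prefix=""):
    """Return sorted voice short names from a JSON voice-list payload.

    Assumption: the payload is a JSON array of objects that carry
    "ShortName" and "Locale" keys.
    """
    voices = json.loads(payload)
    names = [
        voice["ShortName"]
        for voice in voices
        if voice.get("Locale", "").startswith(locale_prefix)
    ]
    return sorted(names)


def fetch_voices(url, locale_prefix="", timeout=10):
    """Download the voice list from `url` (supplied by the caller) and parse it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_voices(response.read().decode("utf-8"), locale_prefix)
```

Splitting parsing from downloading makes it possible to test the filtering offline and, if desired, to fall back to a cached list when the request fails.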
I am open to feedback and eager to improve my contributions to the project.