Review generated chunks before merging

Self Checks

[X] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find any relevant information that meets my needs. (Documentation languages: English, 中文, 日本語, Portuguese (Brazil))
[X] I have searched for existing issues, including closed ones.
[X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
[X] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell us your story.

When generating audio for a long text, sometimes only a small part comes out poorly. We then have to either regenerate the entire text or manually edit the bad part.

2. What is your suggested solution?

I suggest adding a review step before the final merge: listen to each generated audio chunk, regenerate only the problematic ones, and then merge them, optionally inserting a configurable amount of silence (milliseconds or seconds) between chunks. This could work like the textgen feature from alltalk_tts, with silence insertion added.
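
To make the request more concrete, here is a rough sketch of the merge step I have in mind. It is not based on fish-speech's actual code: `merge_chunks`, `silence_bytes`, `regenerate_rejected`, and the `synthesize` callable are all hypothetical names, and I am assuming each chunk is raw signed 16-bit PCM at a shared sample rate.

```python
from typing import Callable, Iterable, List, Set


def silence_bytes(duration_ms: int, sample_rate: int,
                  sample_width: int = 2, channels: int = 1) -> bytes:
    """Return zero-valued PCM frames lasting roughly duration_ms.

    Assumption: signed PCM, where an all-zero sample is silence.
    """
    n_frames = sample_rate * duration_ms // 1000
    return b"\x00" * (n_frames * sample_width * channels)


def merge_chunks(chunks: Iterable[bytes], sample_rate: int, silence_ms: int = 0,
                 sample_width: int = 2, channels: int = 1) -> bytes:
    """Concatenate PCM chunks, placing silence_ms of silence between neighbors."""
    gap = silence_bytes(silence_ms, sample_rate, sample_width, channels)
    return gap.join(chunks)


def regenerate_rejected(texts: List[str], chunks: List[bytes], rejected: Set[int],
                        synthesize: Callable[[str], bytes]) -> List[bytes]:
    """Re-synthesize only the chunks the user rejected during review.

    `synthesize` is a hypothetical stand-in for whatever turns text into PCM.
    """
    return [synthesize(texts[i]) if i in rejected else chunk
            for i, chunk in enumerate(chunks)]
```

The workflow would then be: generate all chunks, let the user mark the bad indices, call `regenerate_rejected` until the user is satisfied, and finally call `merge_chunks` with the chosen `silence_ms`.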

Additionally, an optional chunking method that splits the text at every new paragraph would be useful.
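
As an illustration only, paragraph-based chunking could be as simple as the sketch below. `split_paragraphs` is a hypothetical helper, and I am assuming that a paragraph break means one or more blank lines.

```python
import re
from typing import List


def split_paragraphs(text: str) -> List[str]:
    """Split text into paragraphs at blank lines, dropping empty pieces."""
    # A break is a newline, optional whitespace (including more newlines),
    # and another newline. Single line breaks inside a paragraph are kept.
    parts = re.split(r"\n\s*\n", text.strip())
    return [part.strip() for part in parts if part.strip()]
```

Each returned paragraph would then be synthesized as its own chunk.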

An alternative, simpler approach would be to allow batch generation that saves each output as a separate file, either one file per paragraph or one file per TXT file in an input folder.
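
A minimal sketch of the folder-based variant, under my own assumptions: `batch_generate` and the `synthesize` callable are hypothetical names (not an existing fish-speech API), and the output is assumed to be mono signed 16-bit PCM written as WAV.

```python
import wave
from pathlib import Path
from typing import Callable, List


def batch_generate(input_dir: str, output_dir: str,
                   synthesize: Callable[[str], bytes],
                   sample_rate: int = 44100) -> List[Path]:
    """Write one WAV file per .txt file found in input_dir.

    `synthesize` is a hypothetical callable mapping text to mono 16-bit PCM bytes.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for txt_path in sorted(Path(input_dir).glob("*.txt")):
        pcm = synthesize(txt_path.read_text(encoding="utf-8"))
        target = out / f"{txt_path.stem}.wav"
        with wave.open(str(target), "wb") as wav:
            wav.setnchannels(1)      # mono (assumption)
            wav.setsampwidth(2)      # 16-bit samples (assumption)
            wav.setframerate(sample_rate)
            wav.writeframes(pcm)
        written.append(target)
    return written
```

Because every input file gets its own output file, a bad result can be fixed by regenerating just that one file.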

3. Additional context or comments

Thank you very much for this project.

4. Can you help us with this feature?

[ ] I am interested in contributing to this feature.

fishaudio / fish-speech