coqui-ai / TTS

πŸΈπŸ’¬ - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0
31.64k stars 3.78k forks

[Feature request] Allow the use of `logging` instead of `print` #3729

Open christophertubbs opened 1 month ago

christophertubbs commented 1 month ago

🚀 Feature Description

The print function is used in several places, most noticeably (to me) in utils.synthesizer.Synthesizer.tts, with lines like:

        print(f" > Processing time: {process_time}")
        print(f" > Real-time factor: {process_time / audio_time}")

This is great when messing around, but it'd be nice to have the option to use different types of loggers (or even just the root). For instance, if I have a distributed application, I can have this writing to something that would send the messages through a pubsub setup so that another application may read and interpret the output in real time.
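As a rough sketch of that use case: once messages go through `logging`, a small custom handler can forward each one to any callable, such as a pubsub client's publish method. The `CallableHandler` class and the `tts_demo` logger name are made up for illustration, and a plain list stands in for the pubsub connection.

```python
import logging
from typing import Callable, List


class CallableHandler(logging.Handler):
    """Hypothetical handler that forwards each formatted record to a callable."""

    def __init__(self, sink: Callable[[str], object]) -> None:
        super().__init__()
        self._sink = sink

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self._sink(self.format(record))
        except Exception:
            # Follow the logging convention: a failing handler should not crash the caller.
            self.handleError(record)


# Stand-in for a pubsub client: collect the messages in a list.
published: List[str] = []

demo_logger = logging.getLogger("tts_demo")  # hypothetical logger name
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo output away from the root logger
demo_logger.addHandler(CallableHandler(published.append))

demo_logger.info(" > Processing time: %s", 1.25)
```

With a real client, the list's `append` would be swapped for whatever function sends a string to the message bus.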

Solution

utils.synthesizer.Synthesizer's signature can be changed to look like:

    def __init__(
        self,
        tts_checkpoint: str = "",
        tts_config_path: str = "",
        tts_speakers_file: str = "",
        tts_languages_file: str = "",
        vocoder_checkpoint: str = "",
        vocoder_config: str = "",
        encoder_checkpoint: str = "",
        encoder_config: str = "",
        vc_checkpoint: str = "",
        vc_config: str = "",
        model_dir: str = "",
        voice_dir: str = None,
        use_cuda: bool = False,
        logger: logging.Logger = None
    ) -> None:

and the tts function can look like:

    if self.__logger:
        self.__logger.info(f" > Processing time: {process_time}")
        self.__logger.info(f" > Real-time factor: {process_time / audio_time}")
    else:
        print(f" > Processing time: {process_time}")
        print(f" > Real-time factor: {process_time / audio_time}")

A Protocol for the logger might work better than just the hint of logging.Logger - it'd allow programmers to put in some wackier functionality, such as writing non-loggers that just so happen to have a similar signature.
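A minimal sketch of the Protocol idea, assuming only an `info` method is needed. `SupportsInfo`, `SynthesizerSketch`, `report`, and `ListLogger` are hypothetical names, not part of TTS; the class is a stand-in reduced to the reporting logic.

```python
from typing import List, Optional, Protocol


class SupportsInfo(Protocol):
    """Hypothetical structural type: anything with an info(msg) method qualifies."""

    def info(self, msg: str) -> None: ...


class SynthesizerSketch:
    """Stand-in for Synthesizer that only keeps the timing report."""

    def __init__(self, logger: Optional[SupportsInfo] = None) -> None:
        self._logger = logger

    def report(self, process_time: float, audio_time: float) -> None:
        lines = [
            f" > Processing time: {process_time}",
            f" > Real-time factor: {process_time / audio_time}",
        ]
        for line in lines:
            if self._logger is not None:
                self._logger.info(line)
            else:
                print(line)  # unchanged default behaviour


class ListLogger:
    """A non-logger that happens to match the protocol."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def info(self, msg: str) -> None:
        self.messages.append(msg)


collector = ListLogger()
SynthesizerSketch(collector).report(process_time=2.0, audio_time=4.0)
```

A regular `logging.Logger` also satisfies this protocol, so `SynthesizerSketch(logging.getLogger(__name__))` would work the same way.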

Alternative Solutions

An alternative solution would be to pass the writing function to tts itself, something like:

    def tts(
        self,
        text: str = "",
        speaker_name: str = "",
        language_name: str = "",
        speaker_wav=None,
        style_wav=None,
        style_text=None,
        reference_wav=None,
        reference_speaker_name=None,
        split_sentences: bool = True,
        logging_function: typing.Callable[[str], typing.Any] = None,
        **kwargs,
    ) -> List[int]:

    ...

    if logging_function:
        logging_function(f" > Processing time: {process_time}")
        logging_function(f" > Real-time factor: {process_time / audio_time}")
    else:
        print(f" > Processing time: {process_time}")
        print(f" > Real-time factor: {process_time / audio_time}")

This will enable code like:

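    def output_sound(text: str, output_path: pathlib.Path, connection: Redis):
        from TTS.api import TTS
        speech_model = TTS(DEFAULT_MODEL).to("cpu")
        # Redis.publish expects (channel, message), so bind a channel first ("tts-log" is a placeholder).
        publish = lambda message: connection.publish("tts-log", message)
        speech_model.tts_to_file(text=text, speaker="p244", file_path=str(output_path), logging_function=publish)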

Additional context

I don't believe that utils.synthesizer.Synthesizer.tts is the only location of the standard print function. A consistent solution should be applied to those other locations too.

The parameter for the logging functionality will need to be passed through the objects and functions that lead to the current print statements. For instance, TTS.api.TTS.tts_to_file would require a logging_function parameter if it were to pass the function through to self.synthesizer.tts within the tts function.
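The pass-through described above could look roughly like this. Both classes are simplified stand-ins (`ApiSketch` for TTS.api.TTS, `SynthesizerSketch` for the synthesizer), and the timing value and waveform are placeholders rather than anything the library computes.

```python
from typing import Any, Callable, List, Optional

LogFunction = Callable[[str], Any]


class SynthesizerSketch:
    def tts(self, text: str, logging_function: Optional[LogFunction] = None) -> List[int]:
        # Fall back to print so existing callers see no change.
        emit = logging_function if logging_function is not None else print
        process_time = 0.5  # placeholder; the real method measures this
        emit(f" > Processing time: {process_time}")
        return [0] * len(text)  # placeholder waveform


class ApiSketch:
    def __init__(self) -> None:
        self.synthesizer = SynthesizerSketch()

    def tts(self, text: str, logging_function: Optional[LogFunction] = None) -> List[int]:
        # Each layer just hands the callable to the next one down.
        return self.synthesizer.tts(text, logging_function=logging_function)

    def tts_to_file(
        self, text: str, file_path: str, logging_function: Optional[LogFunction] = None
    ) -> str:
        self.tts(text, logging_function=logging_function)
        return file_path  # writing the audio file is omitted in this sketch


messages: List[str] = []
ApiSketch().tts_to_file("hello", "out.wav", logging_function=messages.append)
```

Every public entry point needs the extra keyword, which is the main cost of this approach compared with a logger stored once on the object.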

The general vibe of the solutions I've provided will make sure that pre-existing code behaves no differently, making the new functionality purely opt-in.

I haven't written anything using a progress bar like the one that this uses, so I can't speak to that aside from the fact that it might need to be excluded.

eginhard commented 1 month ago

In our fork (pip install coqui-tts) all prints have been switched to Python logging. Feel free to try it out and let us know if it works for you. (also a duplicate of #1691)
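For anyone trying this out, a sketch of routing such log output with the standard library. It assumes the modules create their loggers with `logging.getLogger(__name__)`, so that everything lives under the `TTS` package name; that naming is an assumption here, not something confirmed in this thread, and the child logger name below is only an example.

```python
import logging
import sys

# Assumption: loggers are named after their modules, i.e. they start with "TTS".
tts_logger = logging.getLogger("TTS")
tts_logger.setLevel(logging.INFO)

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
tts_logger.addHandler(handler)

# A child logger (hypothetical name) inherits the level and handler from "TTS".
child_logger = logging.getLogger("TTS.utils.synthesizer")
child_logger.info(" > Processing time: %s", 1.25)
```

Replacing the `StreamHandler` with any other `logging.Handler` sends the same messages elsewhere without touching the library code.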

christophertubbs commented 1 month ago

Thanks for your work there! I wasn't aware that TTS was essentially dead here when I posted. Is there anything I need to know when migrating over to your version?

eginhard commented 1 month ago

No, there aren't any major changes and you can use it in the same way.

stale[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.