griptape-ai / griptape

Modular Python framework for AI agents and workflows with chain-of-thought reasoning, tools, and memory.
https://www.griptape.ai
Apache License 2.0
2.02k stars 170 forks

Support `max_characters` for text to speech models/drivers #1149

Open vachillo opened 2 months ago

vachillo commented 2 months ago

Is your feature request related to a problem? Please describe.

OpenAI's `tts-1` model has an input character limit of 4096. The driver should split longer input across multiple API calls so that each call stays within this limit.


collindutter commented 1 month ago

I'm not sure this necessarily needs to live in the framework. We don't currently enforce a maximum input size for basic Driver interactions. We could put this logic in an Engine, but that feels like overkill. In the case of the PromptSummaryEngine, there is some complexity in summarizing as much text as possible while maintaining a running summary, but I don't think the same complexity exists for this use case.

Pulling a snippet you've shared:

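        # Join all prompts into one string, then slice it into fixed-size chunks.
        prompt_str = self.prompt_joiner.join(prompts).strip()
        # max_characters is the driver attribute proposed in this issue.
        new_prompts = [
            prompt_str[i : i + self.text_to_speech_driver.max_characters]
            for i in range(0, len(prompt_str), self.text_to_speech_driver.max_characters)
        ]
        # One driver call per chunk, so the result is a list with one entry per chunk.
        return [self.text_to_speech_driver.try_text_to_audio(prompt=prompt) for prompt in new_prompts]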
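
For comparison, here is a minimal standalone sketch of the same chunking idea outside the framework. It does not use any griptape API: `split_text`, `text_to_audio_chunks`, and the `synthesize` callable are hypothetical names introduced only for illustration. Unlike the fixed-width slicing above, it assumes you would rather break at a space when one is available, and it falls back to a hard cut otherwise.

```python
from typing import Callable, List


def split_text(text: str, max_characters: int) -> List[str]:
    """Split text into chunks of at most max_characters, breaking at spaces when possible."""
    if max_characters <= 0:
        raise ValueError("max_characters must be positive")
    chunks: List[str] = []
    remaining = text.strip()
    while len(remaining) > max_characters:
        # Find the last space that still leaves the chunk within the limit.
        cut = remaining.rfind(" ", 0, max_characters + 1)
        if cut <= 0:
            # No space in range: fall back to a hard cut at the limit.
            cut = max_characters
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


def text_to_audio_chunks(
    text: str,
    max_characters: int,
    synthesize: Callable[[str], bytes],
) -> List[bytes]:
    """Call a (hypothetical) text-to-speech function once per chunk."""
    return [synthesize(chunk) for chunk in split_text(text, max_characters)]
```

A caller would pass its own text-to-speech function as `synthesize` and 4096 as `max_characters` for the limit mentioned in the issue. Whether breaking at spaces produces better audio than fixed-width slicing is an assumption, not something established in this thread.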