Elevenlabs-langchain API should integrate very important parameter 'voice'

agilebean commented 8 months ago

Feature request

The LangChain integration for ElevenLabs should accommodate the very important voice parameter.

Motivation

Without the voice parameter, the Elevenlabs langchain integration is unusable because play() switches the voice randomly with every execution.

Your contribution

I tracked the relevant code in the Elevenlabs-langchain integration down to this section:

    def _run(
        self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None
    ) -> str:
        """Use the tool."""
        elevenlabs = _import_elevenlabs()
        try:
            speech = elevenlabs.generate(text=query, model=self.model)
            with tempfile.NamedTemporaryFile(
                mode="bx", suffix=".wav", delete=False
            ) as f:
                f.write(speech)
            return f.name
        except Exception as e:
            raise RuntimeError(f"Error while running ElevenLabsText2SpeechTool: {e}")

    def play(self, speech_file: str) -> None:
        """Play the text as speech."""
        elevenlabs = _import_elevenlabs()
        with open(speech_file, mode="rb") as f:
            speech = f.read()

        elevenlabs.play(speech)

    def stream_speech(self, query: str) -> None:
        """Stream the text as speech as it is generated.
        Play the text in your speakers."""
        elevenlabs = _import_elevenlabs()
        speech_stream = elevenlabs.generate(text=query, model=self.model, stream=True)
        elevenlabs.stream(speech_stream)

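The calls to elevenlabs.generate in _run and stream_speech must be extended with the voice parameter (play only plays back audio that has already been generated, so the voice has to be set at generation time).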
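
A minimal sketch of what that change could look like, not the actual LangChain implementation. The class name VoiceAwareText2Speech, its voice field, and the fake_generate stub are hypothetical and exist only for illustration; the stub stands in for elevenlabs.generate so the example runs offline. The voice and model keyword arguments are the ones used in the ElevenLabs call quoted in this issue.

```python
import tempfile
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class VoiceAwareText2Speech:
    """Hypothetical stand-in for ElevenLabsText2SpeechTool with a voice field."""

    # Callable that turns text into audio bytes. In real use this would be
    # elevenlabs.generate; it is injected here so the sketch runs offline.
    generate: Callable[..., bytes]
    model: str = "eleven_monolingual_v1"
    voice: str = "Adam"  # assumed new field, not part of the current tool

    def run(self, query: str) -> str:
        """Generate speech with the configured voice and return the file path."""
        # The only change compared to the code above: voice is passed through.
        speech = self.generate(text=query, voice=self.voice, model=self.model)
        with tempfile.NamedTemporaryFile(
            mode="wb", suffix=".wav", delete=False
        ) as f:
            f.write(speech)
        return f.name


# Offline stub that records the keyword arguments it receives.
calls: List[Dict[str, Any]] = []


def fake_generate(**kwargs: Any) -> bytes:
    calls.append(kwargs)
    return b"fake-audio"


tool = VoiceAwareText2Speech(generate=fake_generate, voice="Adam")
path = tool.run("Hello from LangChain")
```

stream_speech would need the same one-argument change in its generate call, so that every execution uses the same voice.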

After this change, the langchain calls run, play, and importantly, stream_speech, could be made analogous to the calls in the ElevenLabs API, e.g. as seen in this repo:

prediction = agent_chain.run(input=user_input.text)
audio = generate(
    text=prediction,
    voice="Adam",
    model="eleven_monolingual_v1"
)

dosubot[bot] commented 5 months ago

Hi, @agilebean,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. The issue you opened requests the integration of the 'voice' parameter into the Elevenlabs-langchain API to address the random switching of voices with each execution of the 'play()' function. There hasn't been any further activity or comments on the issue.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

agilebean commented 5 months ago

This issue is still valid and as relevant as ever. Without a fix, the ElevenLabs integration is not usable!

dosubot[bot] commented 5 months ago

🤖

@baskaryan, could you please assist @agilebean with the integration of the 'voice' parameter into the Elevenlabs-langchain API? The issue is still relevant and crucial for the usability of the API. Thank you!

adam-clarey commented 4 months ago

I also need this feature.

agilebean commented 1 month ago

Please reopen; otherwise the ElevenLabs API is unusable!
