KoljaB / AIVoiceChat

Low latency ai companion voice talk in 60 lines of code using faster_whisper and elevenlabs input streaming
200 stars, 38 forks

Does it work with a custom ChatBot trained with own DB with Langchain and OpenAI? #1

Open venturaEffect opened 10 months ago

venturaEffect commented 10 months ago

Hi @KoljaB !

First of all congrats on your project.

I'm really interested in using it, but I need to know whether it can be used with my ChatBot, which I created with my own custom data in Langchain. I already use the OpenAI API to make it work, but I need to use it with my Chatbot.

Is that possible?

Appreciate

KoljaB commented 10 months ago

Hi @venturaEffect !

Thanks for your kind words on the project! I must admit I'm not too familiar with Langchain. I'll try to break down your requirement.

Hope this helps, and please don't hesitate to ask if you have any further questions!

Best wishes

venturaEffect commented 10 months ago

OK, I also have no idea whether the text can be retrieved while it is being generated or only once it is completely finished.

Will have to test it and ask the Langchain community.

Thanks a lot for your help!

In any case, if you like, I can send you my code so you can test it yourself.

Appreciate

KoljaB commented 10 months ago

Sure, just send it over to my email kolja.beigel@web.de. Would be nice to learn a bit about Langchain while trying to get a streaming text output.

venturaEffect commented 10 months ago

Sent!

KoljaB commented 10 months ago

Looked into that. Langchain seems to only offer a callback for tokens, which would work like this:

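from langchain.callbacks.base import BaseCallbackHandler
from langchain.llms import OpenAI

class MyCallbackHandler(BaseCallbackHandler):
    # Called by Langchain once for every new token the LLM emits
    def on_llm_new_token(self, token, **kwargs) -> None:
        print(token, end="", flush=True)

# streaming=True makes the model deliver tokens to the callback as they arrive
model = OpenAI(temperature=0, streaming=True, callbacks=[MyCallbackHandler()])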

So we basically have a token stream, but Elevenlabs expects a generator. I currently have no idea how to properly bridge that callback system to a generator (it may need multithreading, since the agent.run call is blocking), but this would be the key next step.
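One common way to bridge a callback into a generator is a worker thread plus a queue. The following is only a minimal sketch of that idea and not code from this project: `callback_to_generator` and `fake_agent_run` are hypothetical names, and `fake_agent_run` merely stands in for a blocking call such as agent.run that reports tokens through a callback. In a real Langchain setup, the assumption is that `on_llm_new_token` would be the place that pushes each token into the queue.

```python
import queue
import threading
from typing import Callable, Iterator

_DONE = object()  # sentinel that marks the end of the token stream


def callback_to_generator(
    run_blocking: Callable[[Callable[[str], None]], None],
) -> Iterator[str]:
    """Run run_blocking(on_token) in a worker thread and yield tokens as they arrive."""
    tokens: queue.Queue = queue.Queue()

    def worker() -> None:
        try:
            # Every callback invocation puts one token into the queue.
            run_blocking(tokens.put)
        except Exception as exc:
            # Hand errors over to the consuming side instead of losing them.
            tokens.put(exc)
        finally:
            tokens.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()

    while True:
        item = tokens.get()  # blocks until the worker delivers something
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        yield item


def fake_agent_run(on_token: Callable[[str], None]) -> None:
    """Hypothetical stand-in for a blocking call that emits tokens via a callback."""
    for token in ["Hello", ", ", "world", "!"]:
        on_token(token)


if __name__ == "__main__":
    for token in callback_to_generator(fake_agent_run):
        print(token, end="", flush=True)
    print()
```

The generator returned by `callback_to_generator` could then be passed wherever a text generator is expected, while the blocking call keeps running in the background thread.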

venturaEffect commented 10 months ago

I wrote to the Langchain Discord server to see if someone has found a solution. It seems super useful to me, and whoever gets it working will get a lot of views, because clearly many people will want their custom Chatbot to be able to hold conversations.

Let me know if you come to a solution.

All the best!

KoljaB commented 10 months ago

One thing I also noticed is that the tokens returned from the callback differ from the answer returned from agent.run. Langchain does some processing on the output from the LLM. That parsing would need to be implemented on the text stream too (maybe by waiting for "AI: " before returning any tokens); otherwise Elevenlabs would also speak out that "Thought: " line.
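The "wait for the marker" idea can be sketched as a small generator that wraps the token stream. This is an illustration under assumptions, not Langchain's actual output parsing: `skip_until_marker` is a hypothetical helper, and it assumes the spoken answer always follows the first occurrence of "AI: ". Because the marker may be split across several tokens, the sketch buffers text until the marker has been seen.

```python
from typing import Iterable, Iterator


def skip_until_marker(tokens: Iterable[str], marker: str = "AI: ") -> Iterator[str]:
    """Drop everything up to and including the first marker, then pass tokens through."""
    buffer = ""
    found = False
    for token in tokens:
        if found:
            yield token
            continue
        # The marker can span several tokens, so search the accumulated text.
        buffer += token
        index = buffer.find(marker)
        if index != -1:
            found = True
            rest = buffer[index + len(marker):]
            if rest:
                yield rest


if __name__ == "__main__":
    raw = ["Thought", ": Do I need a tool? No\n", "A", "I: ", "Hi", " there"]
    print("".join(skip_until_marker(raw)))
```

Note that if the marker never appears, this sketch yields nothing at all, so a real implementation would probably need a fallback for that case.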

venturaEffect commented 10 months ago

This could do the trick:

from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI()

generator = llm.stream("Hi there")

It'll stream AIMessageChunk objects in this case, so you could make it:

from langchain.chat_models import ChatOpenAI
from langchain.schema.output_parser import StrOutputParser

llm = ChatOpenAI() | StrOutputParser()

generator = llm.stream("Hi there")

What do you think?

KoljaB commented 10 months ago

Yes, with this code it will work. A basic example would look like this:

import os
import openai
from elevenlabs import set_api_key, generate, stream
from langchain.chat_models import ChatOpenAI
from langchain.schema.output_parser import StrOutputParser

# Read both API keys from environment variables
set_api_key(os.environ.get("ELEVENLABS_API_KEY"))
openai.api_key = os.environ['OPENAI_API_KEY']

# Piping through StrOutputParser makes the stream yield plain strings
llm = ChatOpenAI() | StrOutputParser()

# Feed the text generator to Elevenlabs and play the audio as it arrives
generator = llm.stream("Hi there")
audio_stream = generate(text=generator, voice="Nicole", model="eleven_monolingual_v1", stream=True)
stream(audio_stream)